ID: 0000885
Project: aMule
Category: Feature Request
View Status: public
Date Submitted: 2006-05-06 19:36
Last Update: 2008-07-09 16:13
Reporter: pcmaster
Assigned To:
Priority: normal
Severity: feature
Reproducibility: always
Status: closed
Resolution: open
Platform:
OS:
OS Version:
Product Version:
Target Version:
Fixed in Version:
Summary: 0000885: Sorting Unicode Characters
Description: I have written a function that can sort words in alphabetical order. It works with both ANSI and UTF-8 encodings.

It has not been exhaustively tested, but it seems to work. It is an old library I wrote in Turbo Pascal years ago to sort ANSI and ASCII characters, now with the ASCII support removed and UTF support added.
Additional Information: To compare UTF or ANSI strings, just use the AlfaComp function, which has this form:

int AlfaComp (char *a, char *b, char juego, char func)


(a and b are char *, passed as s and t in the examples below; "juego" can be UTF8 or ANSI, which are constants defined in the library to indicate the encoding in use; "func" selects one of the three functions available: MAYOR_QUE, MENOR_QUE or IGUAL_QUE).

Examples:

AlfaComp (s,t,UTF8,MAYOR_QUE)
returns 1 if s has to be ordered after t, and 0 if not.

AlfaComp (s,t,ANSI,MENOR_QUE)
returns 1 if s has to be ordered before t, and 0 if not.

AlfaComp (s,t,UTF8,IGUAL_QUE)
returns 1 if both strings are equal (the program treats the letters "á" and "a" as the same). If you don't like this, just use the standard strcmp function from string.h.

Note: The letter ç is a special case. It is normally ordered as a c, but if two words differ only in this letter, the ç is ordered AFTER the c. Because of this, these words are in the correct order:

placa
plaça
plaçada
placar
plaçar
placard

The function takes this into account and returns the correct value. Note: this applies to the MAYOR_QUE and MENOR_QUE functions; the IGUAL_QUE function always reports that two such words are different.
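
The ç rule above amounts to a two-level comparison: compare with ç treated as c, and fall back to "c before ç" only when everything else is equal. What follows is a minimal sketch of that idea, not the code in alfaint.c. It assumes single-byte ISO-8859-1 strings (the ANSI case), folds only ç/Ç, and the names cedilla_compare and primary_key are hypothetical. It returns a negative, zero or positive value like strcmp, rather than the 1/0 result of AlfaComp.

```c
#include <stddef.h>

/* Hypothetical helper: primary sort key of one ISO-8859-1 byte.
   Only c-cedilla is folded here; a full table would also fold
   the accented vowels. */
static unsigned char primary_key(unsigned char ch)
{
    if (ch == 0xE7) return 'c';   /* small c-cedilla -> c */
    if (ch == 0xC7) return 'C';   /* capital C-cedilla -> C */
    return ch;
}

/* Returns <0 if a sorts before b, >0 if after, 0 if identical. */
int cedilla_compare(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    int tie = 0;   /* result of the first c / c-cedilla difference */

    for (; *p != '\0' && *q != '\0'; ++p, ++q) {
        unsigned char kp = primary_key(*p);
        unsigned char kq = primary_key(*q);

        if (kp != kq)                 /* primary difference decides */
            return kp < kq ? -1 : 1;
        if (tie == 0 && *p != *q)     /* remember the tie-breaker */
            tie = *p < *q ? -1 : 1;   /* 'c' (0x63) < cedilla (0xE7) */
    }
    if (*p != '\0') return 1;         /* a is longer: sorts after b */
    if (*q != '\0') return -1;        /* b is longer: a sorts first */
    return tie;                       /* same letters: c before cedilla */
}
```

With this comparison each word in the list above sorts strictly before the next one, for example cedilla_compare("pla\347ada", "placar") is negative because d comes before r, even though the ç appears earlier in the word.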

Other interesting functions in the library:

void QuitarAcentos (char *s, char *e, char juego)

copies the e string into the s string, replacing accented letters with their unaccented form. For example, á and à are replaced by a. juego can be UTF8 or ANSI.
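
As an illustration of what accent removal involves in UTF-8, here is a small sketch under stated assumptions; it is not the library's QuitarAcentos. It only handles the two-byte sequences for U+00C0 to U+00FF (lead byte 0xC3), the mapping table is my own choice (accented vowels, y and ç are folded; ñ and other letters are left alone), and the name strip_accents_utf8 is hypothetical. As in the library, the destination comes first.

```c
#include <stddef.h>

/* Copies src into dst, replacing accented Latin-1 letters encoded
   in UTF-8 with their base letter. dst needs room for
   strlen(src) + 1 bytes; the output is never longer than the input. */
void strip_accents_utf8(char *dst, const char *src)
{
    /* One entry per code point U+00C0..U+00FF; '.' means "keep as is". */
    static const char base[] =
        "AAAAAA.CEEEEIIII..OOOOO..UUUUY.."    /* U+00C0..U+00DF */
        "aaaaaa.ceeeeiiii..ooooo..uuuuy.y";   /* U+00E0..U+00FF */
    const unsigned char *p = (const unsigned char *)src;

    while (*p != '\0') {
        /* 0xC3 followed by 0x80..0xBF encodes U+00C0..U+00FF. */
        if (p[0] == 0xC3 && p[1] >= 0x80 && p[1] <= 0xBF
                && base[p[1] - 0x80] != '.') {
            *dst++ = base[p[1] - 0x80];
            p += 2;                    /* consumed a two-byte character */
        } else {
            *dst++ = (char)*p++;       /* copy every other byte unchanged */
        }
    }
    *dst = '\0';
}
```

For example, the UTF-8 bytes of "plaçar" come out as "placar", while a word containing ñ is copied unchanged.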

void Mayuscula (char *s, char juego)
converts the first letter of a string to uppercase

void Minuscula (char *s, char juego)
converts the first letter of a string to lowercase

void MaysCadena (char *s, char juego)
converts an entire string to uppercase

void MinsCadena (char *s, char juego)
converts an entire string to lowercase

char MeteChar (char *s, int c, char juego)
inserts the character c (an int containing the character code) into the string s, and returns the number of bytes the character occupies in the string (always 1 for ANSI, 1 to 4 for UTF-8).
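
The "1 to 4 bytes" rule follows directly from the UTF-8 encoding ranges. The sketch below shows the standard encoding step for the UTF-8 case only; the name utf8_encode and the choice to return 0 for values that cannot be encoded are assumptions of mine, not the behaviour of MeteChar.

```c
/* Writes code point c into s as UTF-8 (no terminating NUL) and
   returns the number of bytes written, or 0 if c is a surrogate
   or lies above U+10FFFF. s needs room for up to 4 bytes. */
int utf8_encode(char *s, unsigned long c)
{
    unsigned char *out = (unsigned char *)s;

    if (c < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (c >= 0xD800 && c <= 0xDFFF)
            return 0;                     /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF) {                  /* 11110xxx + three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (c >> 18));
        out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;                             /* outside the Unicode range */
}
```

For instance, ç (code 0xE7) is written as the two bytes 0xC3 0xA7, and the euro sign (code 0x20AC) as the three bytes 0xE2 0x82 0xAC.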

int utf24bit (char *s, char juego)
reads a character from s and returns it as an int. s can point to a 1-byte ANSI character or a 1- to 4-byte UTF-8 character.
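
Reading a character back is the reverse operation: take the payload bits of the first byte, then append six bits from each following byte. This is a sketch of that decoding for the UTF-8 case, with a hypothetical name (utf8_decode) and an assumed error convention (-1 for malformed input); it does not reject overlong forms, and it is not the library's utf24bit.

```c
/* Decodes the first UTF-8 character of s and returns its code
   point, or -1 if the bytes do not form a complete sequence. */
long utf8_decode(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    long cp;
    int extra, i;

    if (p[0] < 0x80) {
        return p[0];                      /* plain ASCII, one byte */
    } else if ((p[0] & 0xE0) == 0xC0) {
        cp = p[0] & 0x1F; extra = 1;      /* 110xxxxx */
    } else if ((p[0] & 0xF0) == 0xE0) {
        cp = p[0] & 0x0F; extra = 2;      /* 1110xxxx */
    } else if ((p[0] & 0xF8) == 0xF0) {
        cp = p[0] & 0x07; extra = 3;      /* 11110xxx */
    } else {
        return -1;                        /* continuation or invalid byte */
    }

    for (i = 1; i <= extra; ++i) {
        if ((p[i] & 0xC0) != 0x80)        /* also stops at an early NUL */
            return -1;
        cp = (cp << 6) | (p[i] & 0x3F);   /* append 6 payload bits */
    }
    return cp;
}
```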

int ValidaUTF (char *s)
Tests whether s can be a valid UTF-8 sequence of bytes. (For example, a UTF-8 string cannot have two bytes >192 one right after the other.)
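
A check of this kind can be sketched as a single pass that verifies every lead byte is followed by the right number of continuation bytes (128-191). The function below is an illustration with a hypothetical name (utf8_is_wellformed), not the library's ValidaUTF. It is deliberately simple: it rejects stray continuation bytes, truncated sequences and lead bytes that can never occur, but it does not catch every overlong or surrogate encoding.

```c
/* Returns 1 if s has the byte structure of UTF-8, 0 otherwise. */
int utf8_is_wellformed(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != '\0') {
        int extra;                     /* continuation bytes expected */

        if (*p < 0x80)
            extra = 0;                 /* ASCII */
        else if (*p >= 0xC2 && *p <= 0xDF)
            extra = 1;                 /* two-byte lead */
        else if (*p >= 0xE0 && *p <= 0xEF)
            extra = 2;                 /* three-byte lead */
        else if (*p >= 0xF0 && *p <= 0xF4)
            extra = 3;                 /* four-byte lead */
        else
            return 0;                  /* stray continuation or bad lead */

        ++p;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)   /* must be 10xxxxxx (128-191) */
                return 0;
            ++p;
        }
    }
    return 1;
}
```

Two lead bytes in a row, as in the example above, fail at the inner check because the second lead byte is not in the 128-191 range.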

char NumBytes (char *s, char juego)
reads the first byte of a character and returns the number of bytes the character must have if it is a valid UTF-8 sequence. For example, if the first byte of a UTF-8 character is >= 240, the next three bytes must be in the range 128-191.
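
The length is fully determined by the high bits of the first byte. A minimal sketch of that lookup, with a hypothetical name (utf8_seq_len) and an assumed convention of returning 0 for bytes that cannot start a character:

```c
/* Returns the total length in bytes (1 to 4) of the UTF-8 character
   that starts with lead, or 0 if lead cannot start a character. */
int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;   /* 0xxxxxxx: ASCII */
    if (lead < 0xC0) return 0;   /* 10xxxxxx: continuation byte */
    if (lead < 0xE0) return 2;   /* 110xxxxx */
    if (lead < 0xF0) return 3;   /* 1110xxxx */
    if (lead < 0xF8) return 4;   /* 11110xxx: 240 and above */
    return 0;                    /* 0xF8..0xFF never appear in UTF-8 */
}
```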

size_t UTF8long (char *s, char juego)
returns the number of letters in a UTF-8 or ANSI string.
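
For UTF-8 the number of letters differs from the number of bytes reported by strlen. One common way to count them, shown here as a sketch with a hypothetical name (utf8_strlen) and assuming the input is already valid UTF-8, is to count every byte that is not a continuation byte:

```c
#include <stddef.h>

/* Counts the characters in a valid UTF-8 string by skipping
   continuation bytes (10xxxxxx), which never start a character. */
size_t utf8_strlen(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the UTF-8 bytes of "plaçada" this gives 7, while strlen gives 8 because ç takes two bytes.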

The library also defines constant names for characters >128.

Note: this is my first useful C program. If you find any bugs, please send me a message. Thanks.
Tags: No tags attached.
Fixed in Revision:
Operating System: Any
Attached Files:
alfaint.c (9,375 bytes) 2006-05-06 19:37
ejemplo.c (1,910 bytes) 2006-05-06 19:39
gples.txt (22,474 bytes) 2006-05-06 20:13
alfaint.c (9,321 bytes) 2006-05-06 20:49
alphaint.c (8,725 bytes) 2006-05-08 15:24
test.c (2,056 bytes) 2006-05-08 15:24
gpl.txt (19,941 bytes) 2006-05-08 15:30
alphaint.c (6,039 bytes) 2006-11-06 21:36
alphaint.h (4,264 bytes) 2006-11-06 21:37
test.c (2,346 bytes) 2006-11-06 21:38
compile.sh (213 bytes) 2006-11-06 21:39

Notes
(0001972)
pcmaster (reporter)
2006-05-06 19:39

See bug 472 :)
(0001973)
pcmaster (reporter)
2006-05-06 20:50

Updated alfaint.c. Deleted unnecessary ; characters and used constants in one function.
(0001974)
Kry (manager)
2006-05-07 22:38

If you want it to be included in the aMule distribution so it can be used, you gotta change all those Spanish words to English :)
(0001975)
pcmaster (reporter)
2006-05-08 00:19

Translating...
(0001976)
pcmaster (reporter)
2006-05-08 15:21
edited on: 2006-05-08 15:25

Translation finished :)

Function names changed:

AlfaComp -> int AlphaComp (char *a, char *b, char encoding, char cond)

QuitarAcentos -> void EraseAccents (char *s, char *e, char encoding)

Mayuscula -> void UpperCase (char *s, char encoding)

Minuscula -> void LowerCase (char *s, char encoding)

MaysCadena -> void StringToUpper (char *s, char encoding)

MinsCadena -> void StringToLower (char *s, char encoding)

MeteChar -> char PutChar (char *s, int c, char encoding)

ValidaUTF -> int UTFvalid (char *s)

and a few internal functions.

The constant names have also changed, because these names are mnemonics for the letter and the symbol on it: accent, tilde, etc. The filenames have changed as well.
(0002148)
pcmaster (reporter)
2006-11-06 21:40

Update: Bugs in EQUAL_TO function in alphaint.c and in test.c corrected. Added a header file and a compile script to test the program.

Issue History
Date Modified Username Field Change
2006-05-06 19:36 pcmaster New Issue
2006-05-06 19:36 pcmaster Operating System => Any
2006-05-06 19:37 pcmaster File Added: alfaint.c
2006-05-06 19:39 pcmaster File Added: ejemplo.c
2006-05-06 19:39 pcmaster Note Added: 0001972
2006-05-06 20:13 pcmaster File Added: gples.txt
2006-05-06 20:49 pcmaster File Added: alfaint.c
2006-05-06 20:50 pcmaster Note Added: 0001973
2006-05-07 22:38 Kry Note Added: 0001974
2006-05-08 00:19 pcmaster Note Added: 0001975
2006-05-08 15:21 pcmaster Note Added: 0001976
2006-05-08 15:24 pcmaster File Added: alphaint.c
2006-05-08 15:24 pcmaster File Added: test.c
2006-05-08 15:25 pcmaster Note Edited: 0001976
2006-05-08 15:30 pcmaster File Added: gpl.txt
2006-11-06 21:36 pcmaster File Added: alphaint.c
2006-11-06 21:37 pcmaster File Added: alphaint.h
2006-11-06 21:38 pcmaster File Added: test.c
2006-11-06 21:39 pcmaster File Added: compile.sh
2006-11-06 21:40 pcmaster Note Added: 0002148
2008-07-09 16:13 Wuischke Status new => closed

