My native language is not English; it is Brazilian Portuguese, which has accented characters (á, à, ã, õ, and so on).
My problem is that if I put one of these characters inside a string and iterate over the string one char at a time, it turns out that two chars are needed to display "ã" on the screen.
Here's an image of me iterating over the string "(Não Informado)", which means "Uninformed". The string should have a length of 15 if we count each character one by one, but strlen("(Não Informado)") returns 16.
The code I used to print each character in this image is this one:
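#include <stdio.h>
#include <string.h>

void print_buffer (const char * buffer) {
    size_t size = strlen(buffer);  /* length in bytes, not in displayed characters */
    printf("BUFFER: %s / %zu\n", buffer, size);
    /* Cast to unsigned char so that bytes above 127 are not printed as negative values. */
    for (int i = 0; buffer[i] != '\0'; ++i) {
        printf("[%i]: %i\n", i, (unsigned char) buffer[i]);
    }
}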
So, in a graphical application, a buffer could display "ãbc" while the raw string holds not 3 chars but 4.
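To make the "ãbc" case concrete, here is a minimal sketch. It assumes the string is encoded as UTF-8 (an assumption on my part, not something I have verified for my setup), where "ã" is the two bytes 0xC3 0xA3. The helper name count_high_bytes is hypothetical, made up only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the bytes whose value is above 127,
   i.e. the bytes that cannot be plain ASCII characters. */
size_t count_high_bytes(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (size_t i = 0; s[i] != '\0'; ++i) {
        /* Cast first: plain char may be signed, which would make these bytes negative. */
        if ((unsigned char) s[i] > 127) {
            ++n;
        }
    }
    return n;
}
```

With the bytes written out explicitly, strlen("\xC3\xA3" "bc") is 4 and count_high_bytes("\xC3\xA3" "bc") is 2. The literal is split in two because "\xA3bc" would be read as a single, longer hex escape.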
So here's my question: is there a way to know which chars inside a string are part of one of these special characters? Is there a rule that defines and restricts how this happens? Is a special character always made of 2 chars, or could it be made of 3 or 4, for example?
Thanks