MSX character set -> Unicode HELP NEEDED

Page 5/5

By gdx

Prophet (2978)

08-10-2019, 01:12

What do you mean?
I quickly split it into two pages and added some tables; the text may still need to be corrected.

By Manuel

Ascended (15686)

09-10-2019, 12:11

There are several issues...

1. The "MSX Characters" article doesn't show all MSX characters. Instead, some control codes in the tables cover up the characters. Examples are the character at 0x7F and the characters of the Arabic character sets. This is a good example of a character set Wiki article: https://en.wikipedia.org/wiki/Code_page_437 I hope someone will make something like this for the MSX character set as well (the Wikipedia article is quite bad and also has these issues with control codes).
2. The "MSX font" article does show the full character sets, but only for a selection of machines. And it's a bit odd, because most of these fonts are the same, just displaying different character sets. The interesting part is which different shapes were used for the same character (that is the whole point of a font). For example, there are variations between Japanese machines (see the first page of this forum thread) and also between some Brazilian machines (a really different font style). The style of the "usual" Western unaccented characters is the same everywhere, as far as I know.

By Manuel

Ascended (15686)

09-10-2019, 12:37

wbahnassi wrote:

Alright, I've reviewed all Arabic files. There were some inaccuracies that I fixed, and I've added all the possible mappings to Unicode. Please note that all possibilities are equally correct and important. You can't assume there is a "main" one and the others are duplicates.

Rebecca asked me to thank you for fixing the Arabic mappings for her.

As for the Korean and Japanese mappings: any help is much appreciated!
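
The point quoted above, that one MSX code can have several equally valid Unicode mappings, could be represented as a one-to-many table. Here is a minimal Python sketch of that idea. The table contents are assumptions made for illustration: the code 0x80 is a placeholder and is not taken from any real Arabic MSX character set, and the function names are hypothetical. Only the Unicode code points themselves (U+0628 and U+FE8F) are real.

```python
import unicodedata

# Hypothetical mapping table. Code 0x80 is a placeholder chosen for this
# sketch; it is not taken from any real Arabic MSX character set.
MSX_TO_UNICODE = {
    0x41: ["\u0041"],            # LATIN CAPITAL LETTER A: one candidate
    0x80: ["\u0628", "\uFE8F"],  # base letter BEH and its isolated form
}


def unicode_candidates(msx_code):
    """Return every Unicode candidate for an MSX code, in table order."""
    return list(MSX_TO_UNICODE.get(msx_code, []))


def build_reverse_table(table):
    """Map each Unicode candidate back to its single MSX code."""
    reverse = {}
    for msx_code, candidates in table.items():
        for ch in candidates:
            reverse[ch] = msx_code
    return reverse


if __name__ == "__main__":
    for ch in unicode_candidates(0x80):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Keeping all candidates in a list, instead of picking a "main" one, means a conversion from Unicode back to MSX codes accepts any of them.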

By gdx

Prophet (2978)

09-10-2019, 13:33

Quote:

1. the "MSX Characters" article doesn't show all MSX characters.

"MSX Characters codes" is intended to show the printable characters and how to print them, as well as the differences between MSX machines.

Quote:

2. the "MSX font" article does show the full character sets, but only for a selection of machines.

"MSX fonts" is meant to show where the characters are in memory, as well as the differences between MSX machines.

It takes time to do that, and I have less and less of it. So do not hesitate to add what is missing; otherwise it will stay incomplete for a long time.

By Manuel

Ascended (15686)

09-10-2019, 14:33

Well, I don't agree with this setup. "MSX Character Codes" should show the character sets (including differences between machines). Right now it doesn't; a lot of characters are missing.

The method to print these characters with the BIOS is a nice extra; it's what Rebecca calls the "transaction" mapping. But the fundamental mapping is the character number and the symbol belonging to that number. That is the mapping shown in the MSX Technical Databook, and it is also the data in the font part of the MSX ROM, indexed by that same number. That is what is defined as the 'character set'.
Obviously, a 'listing' of the character set should show all possible characters comprising that character set. If there are characters that can't be printed with the BIOS, that doesn't mean they do not exist. They're still part of the character set.
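
To make "indexed by that same number" concrete, here is a minimal Python sketch of looking up a glyph by character code. It assumes the usual layout of 8 bytes per character, one byte per pixel row, with the most significant bit as the leftmost pixel. The font data below is made up for the example and is not copied from any ROM; the names `glyph_rows` and `render` are hypothetical.

```python
GLYPH_BYTES = 8  # assumed: 8 rows per glyph, one byte per row


def glyph_rows(font_table, code):
    """Return the 8 bitmap bytes stored for a character code (0-255)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("character code must be in range 0..255")
    start = code * GLYPH_BYTES
    rows = bytes(font_table[start:start + GLYPH_BYTES])
    if len(rows) != GLYPH_BYTES:
        raise ValueError("font table is too short for this code")
    return rows


def render(rows):
    """Turn bitmap bytes into text lines, most significant bit leftmost."""
    return ["".join("#" if row & (0x80 >> bit) else "." for bit in range(8))
            for row in rows]


if __name__ == "__main__":
    # Made-up font data: every glyph is blank except a test pattern at 0x7F.
    font = bytearray(256 * GLYPH_BYTES)
    font[0x7F * GLYPH_BYTES:0x80 * GLYPH_BYTES] = bytes(
        [0x00, 0x10, 0x28, 0x44, 0x82, 0xFE, 0x00, 0x00])
    print("\n".join(render(glyph_rows(font, 0x7F))))
```

Every code from 0x00 to 0xFF has an entry in such a table, whether or not the BIOS can print it directly, which is why a listing built this way shows the complete set.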

As I said before, a 'font' is the representation of the characters.

To illustrate the difference: it makes sense to make a mapping from MSX Character Sets to Unicode. Unicode is another way to assign a number to a glyph. The MSX Character Set is the MSX way to assign a number to a glyph.
But fonts are how a glyph is represented. You can have the same character set with different fonts. It means that you will get the same characters when putting a set of codes in VRAM, just rendered in a different style.
But if there are differences in character sets, you get a different character. The font isn't even relevant then; the whole character is different.

There are different character sets used on the MSX, and that's what this whole thread is about. As I went along, I also noticed some slightly different font styles.
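
The distinction between a character set difference and a font difference can be sketched in Python as a comparison of two machines. Everything in the example data is an assumption for illustration: the two "machines" are hypothetical, the bitmaps are made up, and the backslash versus yen sign at 0x5C is used only as an example and has not been checked against any ROM. The function name `compare_machines` is hypothetical as well.

```python
def compare_machines(charset_a, font_a, charset_b, font_b):
    """Classify each shared code as identical, font-only, or charset change."""
    result = {}
    for code in sorted(charset_a.keys() & charset_b.keys()):
        if charset_a[code] != charset_b[code]:
            # Different symbol assigned to the code: the font is irrelevant.
            result[code] = "different character"
        elif font_a.get(code) != font_b.get(code):
            # Same symbol, but drawn with a different bitmap.
            result[code] = "same character, different font"
        else:
            result[code] = "identical"
    return result


if __name__ == "__main__":
    # Hypothetical data: code -> Unicode character, and code -> bitmap bytes.
    charset_a = {0x41: "A", 0x42: "B", 0x5C: "\\"}
    charset_b = {0x41: "A", 0x42: "B", 0x5C: "\u00A5"}
    font_a = {0x41: b"\x20\x50\x88\x88\xF8\x88\x88\x00", 0x42: b"\x01" * 8}
    font_b = {0x41: b"\x70\x88\x88\x88\xF8\x88\x88\x00", 0x42: b"\x01" * 8}
    for code, verdict in compare_machines(
            charset_a, font_a, charset_b, font_b).items():
        print(f"0x{code:02X}: {verdict}")
```

The check on the character set comes first on purpose: once two machines assign different symbols to a code, comparing the bitmaps for that code says nothing about font style.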

By gdx

Prophet (2978)

09-10-2019, 17:29

Whether you agree or not, the MSX works like this. And now no characters are missing (except for Arabic, which is not finished because I do not know all the codes). The mapping as shown in the MSX Technical Databook is in "MSX fonts". The two pages are complementary.

I wonder whether you really looked at the pages before saying all that.

By wbahnassi

Rookie (21)

09-10-2019, 19:53

Manuel wrote:

Rebecca asked me to thank you for fixing the Arabic mappings for her.

You are both welcome :) And thanks for your effort in all of this!
