    What sort of characters can be stored in a string?

    • Discostew
      Discostew F last edited by

      I'm working on my MML player, but first I'm making a parser to validate the MML strings. Initially I was going to parse only for validation and then process the strings as they are during playback. Then I realized I could reduce playback processing by converting the different elements of the MML string to key values. In other words, for elements like "@E" (Envelope), "@ER" (deactivate an Envelope), "V" (Volume), etc., instead of comparing string characters during playback, I convert each element to a number ("@E" becomes 1, "@ER" becomes 2, "V" becomes 3, etc.), then convert those numbers to characters and store those in the string instead. During playback, I simply read a character, convert it to a number, and process it that way. I assume converting the number 0 to a character is out of the question, because a value of 0 is typically a NULL character and ends a string.

      Other than that, as long as I'm not printing the string, can I safely store any number converted to a character? Are strings stored as 2 bytes per character (Unicode)? That would help, because 1 byte per character would cause an issue: I want to have at least 256 values in a row to work with.
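
The conversion step described in this post can be sketched as follows. This is an illustration in Python rather than Fuze, under stated assumptions: the codes 1, 2 and 3 come from the post, while the table `COMMAND_CODES` and the function `tokenize_commands` are hypothetical names, and command parameters are ignored to keep the sketch short.

```python
# Hypothetical command table. The codes 1, 2, 3 follow the post's example;
# the table and the function name are made up for illustration.
COMMAND_CODES = {"@E": 1, "@ER": 2, "V": 3}


def tokenize_commands(mml: str) -> list:
    """Replace each known command name in `mml` with its numeric code."""
    # Try longer names first so "@ER" is not read as "@E" followed by "R".
    names = sorted(COMMAND_CODES, key=len, reverse=True)
    codes = []
    pos = 0
    while pos < len(mml):
        for name in names:
            if mml.startswith(name, pos):
                codes.append(COMMAND_CODES[name])
                pos += len(name)
                break
        else:
            # No command name matches at this position.
            raise ValueError(f"unknown command at position {pos}: {mml[pos:]!r}")
    return codes


print(tokenize_commands("@EV@ER"))  # [1, 3, 2]
```

Matching the longest name first matters here because "@E" is a prefix of "@ER".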

      • Willpowered
        Willpowered Fuze Team last edited by

        Having the character '0' in a string is no problem at all! Don't worry about null terminators, that's all taken care of internally. Strings are stored as one byte per character currently.

        • Discostew
          Discostew F last edited by

          Hmm, just did some testing. Using chr(0) to force a NULL character between two strings seems to do nothing, as if it is discarded. I imagine this is done on purpose? I imagine this is also one reason why we can't assign to individual elements in a string (trying to do so causes an error).

          Anyway, I began testing string length, and it seems it isn't quite one byte per character. If I use chr to convert individual numbers, I get the following lengths...

          0 = 0
          1 ~ 127 = 1
          128 ~ 2047 = 2
          2048 ~ 65535 = 3
          65536 ~ 1114111 = 4
          1114112 ~ .... = 0

          Anyway, at least this gives me something to work with. I can encode my MML commands as single-byte characters (127 values are enough for that), then use two-byte characters for their parameters when needed (which can then be reduced to a range of 0 ~ whatever by subtracting 128 from the value).
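
A minimal sketch of that scheme, written in Python rather than Fuze. The names `pack`, `unpack`, `PARAM_OFFSET` and `MAX_PARAM` are hypothetical; Python's `chr` and `ord` stand in for Fuze's `chr` and `chrVal`. Command codes stay in 1 ~ 127 (single-byte characters, with 0 avoided because chr(0) was dropped in the test above), and a parameter is stored as the character for its value plus 128, which keeps it in the two-byte range 128 ~ 2047.

```python
from typing import Optional

PARAM_OFFSET = 128              # parameters start just past the single-byte range
MAX_PARAM = 2047 - PARAM_OFFSET  # largest parameter that still fits in two bytes


def pack(command_code: int, param: Optional[int] = None) -> str:
    """Encode one command, plus an optional parameter, as one or two characters."""
    if not 1 <= command_code <= 127:
        raise ValueError("command code must be in 1..127")
    out = chr(command_code)
    if param is not None:
        if not 0 <= param <= MAX_PARAM:
            raise ValueError(f"parameter must be in 0..{MAX_PARAM}")
        out += chr(param + PARAM_OFFSET)
    return out


def unpack(packed: str) -> list:
    """Decode a packed string into (command_code, param) tuples."""
    events = []
    for ch in packed:
        value = ord(ch)
        if value < PARAM_OFFSET:
            # Values below 128 always start a new command.
            events.append((value, None))
        else:
            if not events:
                raise ValueError("parameter found before any command")
            # Values from 128 up are the parameter of the preceding command.
            code, _ = events[-1]
            events[-1] = (code, value - PARAM_OFFSET)
    return events


track = pack(3, 100) + pack(2) + pack(1, 0)
print(unpack(track))  # [(3, 100), (2, None), (1, 0)]
```

Because commands and parameters occupy disjoint value ranges, the decoder can tell them apart without any length or separator marker.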

          • Willpowered
            Willpowered Fuze Team last edited by

            Sorry for the confusion. If you're using the chr function to work with characters, the result is encoded in UTF-8, which uses 1 to 4 bytes per character. chrVal will decode them from UTF-8 back to an integer.
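
As a quick cross-check of the 1 to 4 byte figure, here is a sketch in Python rather than Fuze. Python's `chr` and `ord` are stand-ins for Fuze's `chr` and `chrVal`, and `utf8_length` is a made-up helper name. It measures the UTF-8 length at each boundary of the ranges reported earlier in the thread. Python keeps chr(0) as a one-byte character, unlike the behaviour observed in Fuze, so 0 is left out.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 needs for a single code point."""
    return len(chr(code_point).encode("utf-8"))


# Boundaries of the ranges reported in the thread.
for code_point in (1, 127, 128, 2047, 2048, 65535, 65536, 1114111):
    encoded = chr(code_point).encode("utf-8")
    # Decoding and taking ord() recovers the original number.
    assert ord(encoded.decode("utf-8")) == code_point
    print(code_point, utf8_length(code_point))

# Prints:
# 1 1
# 127 1
# 128 2
# 2047 2
# 2048 3
# 65535 3
# 65536 4
# 1114111 4
```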

            • 12Me21
              12Me21 last edited by 12Me21

              From what I can tell, certain byte values are not possible to obtain in strings, because chr outputs UTF-8:

              0-127 (0xxxxxxx) - direct from chr
              128-191 (10xxxxxx) - 2nd/3rd/4th byte of multi-byte chr (>127)
              192-223 (110xxxxx) - 1st byte of 11 bit chr
              224-239 (1110xxxx) - 1st byte of 16 bit chr
              240-247 (11110xxx) - 1st byte of 21 bit chr
              248-255 (11111xxx) - unobtainable
              

              And even if you can create these chars, you won't be able to read their values very easily, because chrVal will probably fail with invalid UTF-8 data.
              You should probably just use an array.
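
The byte ranges listed above can be expressed as a small classifier. This is a Python sketch, not Fuze code; `byte_role` is a hypothetical helper that follows the table in this post as written. It also shows a strict UTF-8 decoder rejecting a byte from the last range, and a plain list holding every value from 0 to 255 without any encoding step.

```python
def byte_role(b: int) -> str:
    """Classify a byte value by its leading bits, following the table above."""
    if b < 0x80:
        return "single-byte character"           # 0xxxxxxx
    if b < 0xC0:
        return "continuation byte"               # 10xxxxxx
    if b < 0xE0:
        return "lead byte of a 2-byte sequence"  # 110xxxxx
    if b < 0xF0:
        return "lead byte of a 3-byte sequence"  # 1110xxxx
    if b < 0xF8:
        return "lead byte of a 4-byte sequence"  # 11110xxx
    return "never produced"                      # 11111xxx


# chr(233) is stored as two bytes: a lead byte and a continuation byte.
print([byte_role(b) for b in chr(233).encode("utf-8")])

# A byte from the last range is not valid UTF-8, so decoding it fails.
try:
    bytes([0xF8]).decode("utf-8")
except UnicodeDecodeError:
    print("0xF8 is rejected by the UTF-8 decoder")

# An array (here a Python list) stores any value in 0..255 directly.
codes = [0, 127, 200, 255]
print(codes[2])  # 200
```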
