I have an idea that might make writing your parser a lot easier (and would make it a lot more capable).
Ever heard of lex / yacc (or their modern counterparts, flex and bison; I'll use those names from here on)? They're often used to build the text-parsing parts of compilers.
You write source for the flex / bison engines and they generate C(++) code that makes up the actual parser.
It works like this:
- flex generates a 'lexical analyser': code that reads the input (in your case the script source) and splits it into 'tokens' using regular expressions.
- bison is a parser generator. The parser it generates takes the tokens coming from the flex part of your program and builds them up into 'sentences' which your code can then make sense of.
To put it concretely, I'll take a line from the example script that comes with the Scripter:
SET MAP=eden04.map
Flex would generate code that slices this string up into different pieces. For example, you might have it slice the string into 4 'tokens':
- A 'keyword' token, in this case "SET"
- A 'variable' token, in this case "MAP"
- An 'operator' token, "="
- A 'value' token, "eden04.map"
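To make the slicing concrete, here is a rough hand-written C sketch of the kind of work a flex-generated scanner would do for a line like this. It is not flex output and not code from the Scripter: the names TokenKind, Token and tokenize_line are made up for illustration, and I'm assuming every line has the shape KEYWORD VARIABLE=VALUE with no spaces inside the value.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for the example line; the names are illustrative. */
enum TokenKind { TOK_KEYWORD, TOK_VARIABLE, TOK_OPERATOR, TOK_VALUE };

struct Token {
    enum TokenKind kind;
    char text[64];
};

/* Copy at most 63 characters of [start, end) into dst and terminate it. */
static void copy_span(char *dst, const char *start, const char *end)
{
    size_t n = (size_t)(end - start);
    if (n > 63) n = 63;
    memcpy(dst, start, n);
    dst[n] = '\0';
}

/*
 * Split a line of the assumed form "KEYWORD VARIABLE=VALUE" into four tokens.
 * Returns 4 on success, or -1 if the line does not have that shape.
 */
int tokenize_line(const char *line, struct Token out[4])
{
    const char *p = line;
    const char *start;

    assert(line != NULL && out != NULL);

    /* Keyword: a run of letters. */
    while (isspace((unsigned char)*p)) p++;
    start = p;
    while (isalpha((unsigned char)*p)) p++;
    if (p == start || !isspace((unsigned char)*p)) return -1;
    out[0].kind = TOK_KEYWORD;
    copy_span(out[0].text, start, p);

    /* Variable: letters, digits or underscores. */
    while (isspace((unsigned char)*p)) p++;
    start = p;
    while (isalnum((unsigned char)*p) || *p == '_') p++;
    if (p == start) return -1;
    out[1].kind = TOK_VARIABLE;
    copy_span(out[1].text, start, p);

    /* Operator: only "=" is recognised in this sketch. */
    while (isspace((unsigned char)*p)) p++;
    if (*p != '=') return -1;
    out[2].kind = TOK_OPERATOR;
    copy_span(out[2].text, p, p + 1);
    p++;

    /* Value: everything up to the next whitespace or end of line. */
    while (isspace((unsigned char)*p)) p++;
    start = p;
    while (*p != '\0' && !isspace((unsigned char)*p)) p++;
    if (p == start) return -1;
    out[3].kind = TOK_VALUE;
    copy_span(out[3].text, start, p);

    return 4;
}
```

With flex you would not write any of this by hand. You would give it one regular expression per token type and it would generate the scanning loop for you.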
This broken-up output is then parsed by bison-generated code that makes sense of it. Bison uses a 'grammar file' as input to generate its parser, and that file might contain a rule like this:
Read the next line in the file. Look for a keyword token. This keyword token must be followed by, in this order, a variable token, an operator token, and then a value token. Then, send each piece of data to the core of the program where it knows what to do.
(Of course, the rule is not written in English. A bison grammar file describes what's called an "LALR" grammar.)
The flex/bison files are interspersed with C(++) code which actually does the work. For the rule above, for example, you might have it call a function with a prototype like this:
void handleLine(const char *keyword, const char *var, const char *op, const char *value);
And it might get called like this, in the preceding example:
handleLine("SET", "MAP", "=", "eden04.map");
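As a rough illustration of how the rule and the call fit together, here is a hand-written C sketch that checks a token sequence against the rule (keyword, variable, operator, value) and only then calls handleLine. A bison-generated parser would do this from the grammar file instead. The Token struct, the TokenKind names and parse_line are assumptions made up for this example, the handleLine body is just a stand-in that prints its arguments, and its parameter is called op because operator is a reserved word in C++.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token representation; the names are illustrative only. */
enum TokenKind { TOK_KEYWORD, TOK_VARIABLE, TOK_OPERATOR, TOK_VALUE };

struct Token {
    enum TokenKind kind;
    const char *text;
};

/* Stand-in for the core of the program: it only prints what it was given. */
void handleLine(const char *keyword, const char *var, const char *op, const char *value)
{
    printf("%s: %s %s %s\n", keyword, var, op, value);
}

/*
 * Apply the rule "keyword, variable, operator, value" to one line of tokens.
 * Returns 1 and calls handleLine if the line matches, 0 otherwise.
 */
int parse_line(const struct Token *tokens, size_t count)
{
    static const enum TokenKind expected[4] = {
        TOK_KEYWORD, TOK_VARIABLE, TOK_OPERATOR, TOK_VALUE
    };
    size_t i;

    assert(tokens != NULL);

    if (count != 4) return 0;
    for (i = 0; i < 4; i++) {
        if (tokens[i].kind != expected[i]) return 0;  /* wrong token in this slot */
    }

    handleLine(tokens[0].text, tokens[1].text, tokens[2].text, tokens[3].text);
    return 1;
}
```

Feeding it the four tokens from the SET MAP line makes it call handleLine once. Any other order or count is rejected before the core of the program ever sees it.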
It could get more powerful than this if you start defining generic rules, so that things like 'expressions' can appear on the right-hand side of the operator. That way it could handle expressions like
SET MAP=ConvertToString(GetVariable(VAR_MAPNAME))
if you wanted, much like the nested function calls C++ supports.
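The reason nested expressions come almost for free is that a grammar rule can refer to itself. Here is a minimal hand-written C sketch of that idea, assuming a made-up rule where an expression is either a plain name or a name followed by exactly one parenthesised expression. The functions parse_expr and expression_depth are hypothetical and only check the syntax. With bison you would write the recursive rule in the grammar file instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Assumed grammar, for illustration only:
 *   expr := NAME | NAME '(' expr ')'
 * where NAME is a run of letters, digits, '_' or '.'.
 *
 * Returns a pointer just past the parsed expression, or NULL on a syntax error.
 * max_depth records the deepest call nesting seen so far.
 */
static const char *parse_expr(const char *p, int depth, int *max_depth)
{
    const char *start = p;

    while (isalnum((unsigned char)*p) || *p == '_' || *p == '.') p++;
    if (p == start) return NULL;                  /* a name is required */
    if (depth > *max_depth) *max_depth = depth;

    if (*p != '(') return p;                      /* plain name or value */

    p = parse_expr(p + 1, depth + 1, max_depth);  /* the rule refers to itself */
    if (p == NULL || *p != ')') return NULL;      /* missing argument or ')' */
    return p + 1;
}

/*
 * Returns the deepest call nesting in text (0 for a plain value),
 * or -1 if text is not a valid expression under the assumed grammar.
 */
int expression_depth(const char *text)
{
    int max_depth = 0;
    const char *end;

    assert(text != NULL);

    end = parse_expr(text, 0, &max_depth);
    if (end == NULL || *end != '\0') return -1;
    return max_depth;
}
```

Under these assumptions, eden04.map is accepted with a nesting of 0, ConvertToString(GetVariable(VAR_MAPNAME)) is accepted with a nesting of 2, and something like Foo( is rejected.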
If you're interested, look at
http://www.codeproject.com/cpp/introlexyacc.asp