One of the issues I’ve been struggling with in building MCsla is that the mgx tool, which compiles a DSL into M, doesn’t output valid M in all cases.
A couple of the issues are simply unsolvable from my side (I mentioned these in my last blog post as well):
- If your grammar includes any Boolean values, they come out as ‘ true’ and ‘false’ (including the single quotes), when they should come out as true and false (no quotes).
- If your grammar includes any lists of values, a single-item list generates invalid M, while multi-item lists generate valid M (though I’m not sure you can describe the result using an MSchema – but at least the result is “valid”).
As I say, these problems are “unsolvable” because the M rendering API is what creates the bad result, and that’s a black box.
The solution, such as it is, requires taking the text output (the M code) and doing some text processing to fix the issues. In the olden days I’d have whipped up an awk script to do this, but these days C# is the tool (even if it is a lot more verbose). Here’s the code I’m using to fix the two issues – kind of a post-processor to the mgx tool:
using System.IO;

public static class MFix
{
    /// <summary>
    /// Fix single-quote and single-list-item
    /// issues with a generated M file.
    /// </summary>
    /// <param name="fileName">Path to the file to fix</param>
    public static void FixFile(string fileName)
    {
        var output = new System.Text.StringBuilder();
        // Fix each line in turn: single-item lists first, then stray quotes.
        foreach (var line in File.ReadAllLines(fileName))
        {
            output.AppendLine(FixQuotes(FixSingleNode(line)));
        }
        File.WriteAllText(fileName, output.ToString());
    }

    /// <summary>
    /// Fix single-quote issue with M generated by
    /// mgx.exe.
    /// </summary>
    /// <param name="input">One line of text input.</param>
    /// <returns>Fixed line of text from input.</returns>
    private static string FixQuotes(string input)
    {
        var sb = new System.Text.StringBuilder(input.Length);
        int state = 0; // 0 = outside a double-quoted string, 1 = inside one
        for (int i = 0; i < input.Length; i++)
        {
            var sub = input.Substring(i, 1);
            if (sub == "\"")
            {
                // A double quote toggles the in-string state.
                if (state == 0)
                    state = 1;
                else
                    state = 0;
                sb.Append(sub);
            }
            else if (sub == "\\" && state == 1)
            {
                // Copy an escape sequence verbatim so an escaped quote
                // doesn't end the string early.
                sb.Append(sub);
                if (i < input.Length - 1)
                {
                    i++;
                    sb.Append(input.Substring(i, 1));
                }
            }
            else if (sub == "'" && state == 0)
            {
                // Drop the quote, and eat the next char if it is a ' '.
                // The bounds check avoids an exception when the quote
                // is the last character on the line.
                if (i < input.Length - 1 && input[i + 1] == ' ')
                    i++;
            }
            else
                sb.Append(sub);
        }
        return sb.ToString();
    }

    /// <summary>
    /// Fix single element array issue with M generated by
    /// mgx.exe.
    /// </summary>
    /// <param name="input">One line of text input.</param>
    /// <returns>Fixed line of text from input.</returns>
    private static string FixSingleNode(string input)
    {
        // Only lines whose first non-blank character is '=' need fixing.
        var trm = input.TrimStart();
        if (!trm.StartsWith("=")) return input;
        var pos = input.IndexOf("=");
        var sb = new System.Text.StringBuilder(input.Length);
        // Replace the '=' with '{' and close the brace at the end of the line.
        sb.Append(input.Substring(0, pos));
        sb.Append(" {");
        sb.Append(input.Substring(pos + 1, input.Length - pos - 1));
        sb.Append(" } ");
        return sb.ToString();
    }
}
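To sanity-check the post-processor, something like the following sketch can be run against a scratch file. The two sample lines are made up for illustration and are not real mgx output; MFixDemo is a hypothetical name, and the sketch assumes the MFix class above is compiled into the same project.

```csharp
using System;
using System.IO;

public static class MFixDemo
{
    public static void Main()
    {
        // Hypothetical sample lines, NOT actual mgx output.
        var path = Path.GetTempFileName();
        File.WriteAllLines(path, new[]
        {
            "Enabled => ' true',",        // stray single quotes around a Boolean
            "Name => \"it's quoted\","    // apostrophe inside a real string
        });

        MFix.FixFile(path);

        // Prints:
        // Enabled => true,
        // Name => "it's quoted",
        Console.Write(File.ReadAllText(path));

        File.Delete(path);
    }
}
```

The second line is the reason FixQuotes tracks whether it is inside a double-quoted string: an apostrophe that is part of real string data has to survive, while the bogus quotes around the Boolean are stripped.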
It is kind of ironic that I had to write a subset of an M parser to fix issues with the M generator API.