This example is from our website backend. We had existing code that parsed a source file and generated HTML with syntax highlighting. Before adding extra functionality such as collapsible blocks, we wanted to split it into 2 parts: a parsing part that produces a list of token objects, and a formatting part that generates HTML. Here is the prompt:
Create nested classes TokenWrapper and LineWrapper with int Number and List<> Tokens.
TokenWrapper should privately wrap original token, have public string Value and nested enum DiffState{Normal, Added, Removed} State.
DumpTokens() should first build a list of wrapped lines, then dump them all.
| Model | Time taken | Prompt tokens | Response tokens |
| --- | --- | --- | --- |
| Claude 3.7 Sonnet | 21s planning + 15s coding | 5.1K | 893 |
| LLaMA3.3-70B (Cerebras) | 0.5s planning + 1s coding | 1.5K | 502 |
| LLaMA-4-Scout-17B-16E-Instruct (Cerebras) | 0.5s planning + 0.4s coding | 1.2K | 552 |
| GPT-4.1-mini | 7s planning + 16s coding | 1.1K | 648 |

The listings below show the original code, followed by each model's output in the same order as the table.
Original:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var result = new StringBuilder();
UniversalTokenizer.TokenType tokenType = 0;
int contextFlags = 0;
foreach (var line in lines)
{
var tokens = tokenizer.TokenizeLine(line, ref tokenType, contextFlags);
foreach (var token in tokens)
{
string cssClass = TranslateTokenType(token.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.GetText(line)}</span>");
else
result.Append(token.GetText(line));
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
Claude 3.7 Sonnet:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class TokenWrapper
{
public enum DiffState
{
Normal,
Added,
Removed
}
private readonly UniversalTokenizer.Token _token;
private readonly string _text;
public string Value => _text;
public DiffState State { get; set; } = DiffState.Normal;
public UniversalTokenizer.TokenType Type => _token.Type;
public TokenWrapper(UniversalTokenizer.Token token, string text)
{
_token = token;
_text = text;
}
}
public class LineWrapper
{
public int Number { get; set; }
public List<TokenWrapper> Tokens { get; set; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var result = new StringBuilder();
var wrappedLines = new List<LineWrapper>();
UniversalTokenizer.TokenType tokenType = 0;
int contextFlags = 0;
for (int lineNumber = 0; lineNumber < lines.Length; lineNumber++)
{
var line = lines[lineNumber];
var tokens = tokenizer.TokenizeLine(line, ref tokenType, contextFlags);
var wrappedTokens = new List<TokenWrapper>();
foreach (var token in tokens)
{
var wrappedToken = new TokenWrapper(token, token.GetText(line));
wrappedTokens.Add(wrappedToken);
}
wrappedLines.Add(new LineWrapper(lineNumber + 1, wrappedTokens));
}
foreach (var wrappedLine in wrappedLines)
{
foreach (var token in wrappedLine.Tokens)
{
string cssClass = TranslateTokenType(token.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
LLaMA-3.3-70B:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class TokenWrapper
{
private UniversalTokenizer.Token _token;
public string Value => _token.GetText("");
public enum DiffState { Normal, Added, Removed };
public DiffState State;
public TokenWrapper(UniversalTokenizer.Token token)
{
_token = token;
}
}
public class LineWrapper
{
public int Number;
public List<TokenWrapper> Tokens;
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
for (int i = 0; i < lines.Length; i++)
{
UniversalTokenizer.TokenType tokenType = 0;
int contextFlags = 0;
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
var wrappedTokens = tokens.Select(token => new TokenWrapper(token)).ToList();
wrappedLines.Add(new LineWrapper(i + 1, wrappedTokens));
}
var result = new StringBuilder();
foreach (var wrappedLine in wrappedLines)
{
foreach (var token in wrappedLine.Tokens)
{
string cssClass = TranslateTokenType(token._token.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
LLaMA-4-Scout:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
UniversalTokenizer.TokenType tokenType = 0;
int contextFlags = 0;
for (int i = 0; i < lines.Length; i++)
{
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
wrappedLines.Add(new LineWrapper(i + 1, tokens.Select(t => new TokenWrapper(t, lines[i])).ToList()));
}
var result = new StringBuilder();
foreach (var line in wrappedLines)
{
foreach (var token in line.Tokens)
{
string cssClass = TranslateTokenType(token.WrappedToken.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
public class LineWrapper
{
public int Number { get; }
public List<TokenWrapper> Tokens { get; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public class TokenWrapper
{
private UniversalTokenizer.Token _wrappedToken;
private string _line;
public string Value => _wrappedToken.GetText(_line);
public enum DiffState { Normal, Added, Removed }
public DiffState State { get; set; }
public UniversalTokenizer.Token WrappedToken => _wrappedToken;
public TokenWrapper(UniversalTokenizer.Token wrappedToken, string line)
{
_wrappedToken = wrappedToken;
_line = line;
State = DiffState.Normal;
}
}
}
}
GPT-4.1-mini:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
UniversalTokenizer.TokenType tokenType = 0;
int contextFlags = 0;
for (int lineIndex = 0; lineIndex < lines.Length; lineIndex++)
{
var line = lines[lineIndex];
var tokens = tokenizer.TokenizeLine(line, ref tokenType, contextFlags);
var tokenWrappers = new List<TokenWrapper>();
foreach (var token in tokens)
tokenWrappers.Add(new TokenWrapper(token, line));
wrappedLines.Add(new LineWrapper(lineIndex, tokenWrappers));
}
var result = new StringBuilder();
foreach (var wrappedLine in wrappedLines)
{
foreach (var token in wrappedLine.Tokens)
{
string cssClass = TranslateTokenType(token.InternalToken.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
public class TokenWrapper
{
internal readonly UniversalTokenizer.Token InternalToken;
public readonly string Value;
public enum DiffState
{
Normal,
Added,
Removed
}
public DiffState State;
public TokenWrapper(UniversalTokenizer.Token token, string line)
{
InternalToken = token;
Value = token.GetText(line);
State = DiffState.Normal;
}
}
public class LineWrapper
{
public readonly int Number;
public readonly List<TokenWrapper> Tokens;
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
}
}
Here we used the planning mode to automatically ask each model what other code it wanted to see, then pull it into the context window, and re-run the original request with it.
Here comes the interesting part: on some of the runs, the models would put the newly generated code before the modified method, and that actually confused some of them.
Here’s how it works: models generate new text based on prior text by assigning different importance to different parts of it. By the time LLaMA-3 produced “public string Value =>”, it was facing a choice:
- Assign higher importance to the partially generated class, so that Value would use one of the fields that already existed.
- Assign higher importance to the instruction saying “public string Value”, infer that it should have a backing field, and reference a field that did not exist yet.
One could guess that most of the training data had backing fields before the properties, so LLaMA preferred a broken _token.GetText(“”) to a backing field that was not yet defined.
Claude Sonnet made the right decision earlier, when it output the backing field before the property. But it took 36 seconds (21s planning + 15s coding), as opposed to 1.5 seconds for LLaMA via Cerebras.
In another run, LLaMA 4 produced the modified method first. By the time it started outputting the class, the constructor call for it was already generated, so it knew it needed a backing field of some kind.
Here’s another interesting part: when writing the constructor call, LLaMA 4 passed the entire original line there because it made sense. When generating the class, it made a backing field for the entire line as well, because a 1-to-1 mapping of backing fields to constructor arguments is common.
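The two wrapper shapes discussed above can be put side by side in a small sketch. It assumes a hypothetical `Token` struct that stores only offsets and exposes `GetText(string line)`; the real `UniversalTokenizer.Token` is not shown in this post, so its members are guesses, and the names `EagerTokenWrapper` and `LineHoldingTokenWrapper` are made up for illustration.

```csharp
using System;

// Hypothetical stand-in for UniversalTokenizer.Token (the real type is not shown here).
// Assumption: a token stores only offsets and needs the source line to recover its text.
public readonly struct Token
{
    public int Start { get; }
    public int Length { get; }

    public Token(int start, int length)
    {
        Start = start;
        Length = length;
    }

    public string GetText(string line) => line.Substring(Start, Length);
}

// Backing data first: the text is extracted once in the constructor,
// so the wrapper never needs the line again.
public sealed class EagerTokenWrapper
{
    private readonly Token _token;

    public string Value { get; }
    public int Start => _token.Start;

    public EagerTokenWrapper(Token token, string line)
    {
        _token = token;
        Value = token.GetText(line);
    }
}

// One field per constructor argument: every wrapper keeps a reference
// to the whole line and re-extracts the text on each access.
public sealed class LineHoldingTokenWrapper
{
    private readonly Token _token;
    private readonly string _line;

    public string Value => _token.GetText(_line);

    public LineHoldingTokenWrapper(Token token, string line)
    {
        _token = token;
        _line = line;
    }
}

public static class WrapperDemo
{
    public static void Main()
    {
        const string line = "return 42;";
        var token = new Token(0, 6);

        // Both shapes expose the same text; they differ only in what they retain.
        Console.WriteLine(new EagerTokenWrapper(token, line).Value);       // return
        Console.WriteLine(new LineHoldingTokenWrapper(token, line).Value); // return
    }
}
```

Both wrappers print the same text. The difference is that the second shape keeps every source line reachable for as long as any of its tokens is alive.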
Getting better results
One way to get better results is to use larger models or more detailed prompts. Another option is to quickly iterate on edits with very brief prompts. Let’s deliberately make LLaMA 3 fail by removing most of the context and picking the most broken output version, and then see what it will take to fix it:
| Step | Prompt | Time taken | Prompt tokens | Response tokens |
| --- | --- | --- | --- | --- |
| Take 1 | (initial) | 0.9 seconds | 1.4K | 562 |
| Fix Errors | Fix errors | 1.9 seconds | 2K | 702 |
| Think and Fix | (see below) | 0.9 seconds | 2.1K | 638 |
| +value to ctor | +value to ctor | 0.8 seconds | 2K | 567 |
| +token.Type | +token.Type | 1.0 seconds | 2.6K | 589 |

The starting point is the same original code as in the first experiment. The listings below show the output of each step, in the same order as the table.
Take 1:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class LineWrapper
{
public int Number { get; }
public List<TokenWrapper> Tokens { get; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public class TokenWrapper
{
private readonly UniversalTokenizer.Token _originalToken;
public string Value { get; }
public DiffState State { get; }
public enum DiffState { Normal, Added, Removed }
public TokenWrapper(UniversalTokenizer.Token originalToken, DiffState state)
{
_originalToken = originalToken;
Value = _originalToken.GetText();
State = state;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
var tokenType = UniversalTokenizer.TokenType.Invalid;
var contextFlags = 0;
for (var i = 0; i < lines.Length; i++)
{
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
var wrappedTokens = tokens.Select(t => new TokenWrapper(t, TokenWrapper.DiffState.Normal)).ToList();
wrappedLines.Add(new LineWrapper(i + 1, wrappedTokens));
}
var result = new StringBuilder();
foreach (var line in wrappedLines)
{
foreach (var token in line.Tokens)
{
string cssClass = TranslateTokenType(token.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
Fix Errors:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class LineWrapper
{
public int Number { get; }
public List<TokenWrapper> Tokens { get; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public class TokenWrapper
{
private readonly UniversalTokenizer.Token _originalToken;
public string Value { get; }
public DiffState State { get; }
public enum DiffState { Normal, Added, Removed }
public TokenWrapper(UniversalTokenizer.Token originalToken, DiffState state)
{
_originalToken = originalToken;
Value = _originalToken.GetText();
State = state;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
var tokenType = UniversalTokenizer.TokenType.Invalid;
var contextFlags = 0;
for (var i = 0; i < lines.Length; i++)
{
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
var wrappedTokens = tokens.Select(t => new TokenWrapper(t, TokenWrapper.DiffState.Normal)).ToList();
wrappedLines.Add(new LineWrapper(i + 1, wrappedTokens));
}
var result = new StringBuilder();
foreach (var line in wrappedLines)
{
var localTokenType = UniversalTokenizer.TokenType.Invalid;
foreach (var token in line.Tokens)
{
string originalLine = lines[line.Number - 1];
string cssClass = TranslateTokenType(token._originalToken.Type);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
Think and Fix:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class LineWrapper
{
public int Number { get; }
public List<TokenWrapper> Tokens { get; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public class TokenWrapper
{
private readonly UniversalTokenizer.Token _originalToken;
private readonly string _line;
public string Value { get; }
public DiffState State { get; }
public enum DiffState { Normal, Added, Removed }
public TokenWrapper(UniversalTokenizer.Token originalToken, string line, DiffState state)
{
_originalToken = originalToken;
_line = line;
Value = _originalToken.GetText(_line);
State = state;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
var tokenType = UniversalTokenizer.TokenType.Invalid;
var contextFlags = 0;
for (var i = 0; i < lines.Length; i++)
{
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
var wrappedTokens = tokens.Select(t => new TokenWrapper(t, lines[i], TokenWrapper.DiffState.Normal)).ToList();
wrappedLines.Add(new LineWrapper(i + 1, wrappedTokens));
}
var result = new StringBuilder();
foreach (var line in wrappedLines)
{
foreach (var token in line.Tokens)
{
string cssClass = TranslateTokenType(tokenType);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
+value to ctor:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class LineWrapper
{
public int Number { get; }
public List<TokenWrapper> Tokens { get; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public class TokenWrapper
{
private readonly UniversalTokenizer.Token _originalToken;
public string Value { get; }
public DiffState State { get; }
public enum DiffState { Normal, Added, Removed }
public TokenWrapper(UniversalTokenizer.Token originalToken, string value, DiffState state)
{
_originalToken = originalToken;
Value = value;
State = state;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
var tokenType = UniversalTokenizer.TokenType.Invalid;
var contextFlags = 0;
for (var i = 0; i < lines.Length; i++)
{
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
var wrappedTokens = tokens.Select(t => new TokenWrapper(t, t.GetText(lines[i]), TokenWrapper.DiffState.Normal)).ToList();
wrappedLines.Add(new LineWrapper(i + 1, wrappedTokens));
}
var result = new StringBuilder();
foreach (var line in wrappedLines)
{
foreach (var token in line.Tokens)
{
string cssClass = TranslateTokenType(tokenType);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
+token.Type:
using SysprogsDevTools.Tokenizer;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Sysprogs.IDE.Framework.Languages.DiffExport
{
public class ExporterDemo
{
public class LineWrapper
{
public int Number { get; }
public List<TokenWrapper> Tokens { get; }
public LineWrapper(int number, List<TokenWrapper> tokens)
{
Number = number;
Tokens = tokens;
}
}
public class TokenWrapper
{
private readonly UniversalTokenizer.Token _originalToken;
public string Value { get; }
public UniversalTokenizer.TokenType OriginalType { get; }
public DiffState State { get; }
public enum DiffState { Normal, Added, Removed }
public TokenWrapper(UniversalTokenizer.Token originalToken, string value, DiffState state)
{
_originalToken = originalToken;
Value = value;
OriginalType = originalToken.Type;
State = state;
}
}
public static string DumpTokens(string text, UniversalTokenizer tokenizer)
{
var lines = text.Split('\n');
var wrappedLines = new List<LineWrapper>();
var tokenType = UniversalTokenizer.TokenType.Invalid;
var contextFlags = 0;
for (var i = 0; i < lines.Length; i++)
{
var tokens = tokenizer.TokenizeLine(lines[i], ref tokenType, contextFlags);
var wrappedTokens = tokens.Select(t => new TokenWrapper(t, t.GetText(lines[i]), TokenWrapper.DiffState.Normal)).ToList();
wrappedLines.Add(new LineWrapper(i + 1, wrappedTokens));
}
var result = new StringBuilder();
foreach (var line in wrappedLines)
{
foreach (var token in line.Tokens)
{
string cssClass = TranslateTokenType(token.OriginalType);
if (cssClass != null)
result.Append($"<span class=\"{cssClass}\">{token.Value}</span>");
else
result.Append(token.Value);
}
result.AppendLine();
}
return result.ToString();
}
private static string TranslateTokenType(UniversalTokenizer.TokenType tokenType)
{
switch (tokenType)
{
case UniversalTokenizer.TokenType.Keyword:
return "keyword";
case UniversalTokenizer.TokenType.CharacterLiteral:
case UniversalTokenizer.TokenType.StringLiteral:
return "literal";
case UniversalTokenizer.TokenType.Comment:
return "comment";
default:
return null;
}
}
}
}
The first version here is broken in 2 ways: it doesn’t pass anything to GetText(), and uses the last token type when outputting all tokens as HTML.
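The second bug is easy to miss because the code still compiles. Here is a minimal sketch of it, with made-up `TokenType`, `WrappedToken`, and `Formatter` types standing in for the real tokenizer classes (which are not shown here). The leftover tokenizer state is a single value for the whole file, so using it in the formatting loop gives every token the same CSS class.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical simplified types; the real UniversalTokenizer types are not shown in this post.
public enum TokenType { Plain, Keyword }

public sealed class WrappedToken
{
    public string Value { get; }
    public TokenType Type { get; }

    public WrappedToken(string value, TokenType type)
    {
        Value = value;
        Type = type;
    }
}

public static class Formatter
{
    private static string CssClass(TokenType type) =>
        type == TokenType.Keyword ? "keyword" : null;

    private static void Append(StringBuilder result, string cssClass, string value)
    {
        if (cssClass != null)
            result.Append($"<span class=\"{cssClass}\">{value}</span>");
        else
            result.Append(value);
    }

    // Buggy: every token is styled with the single state value left over from tokenizing.
    public static string FormatWithLeftoverState(IEnumerable<WrappedToken> tokens, TokenType leftoverState)
    {
        var result = new StringBuilder();
        foreach (var token in tokens)
            Append(result, CssClass(leftoverState), token.Value);
        return result.ToString();
    }

    // Fixed: each token is styled with the type captured when it was wrapped.
    public static string Format(IEnumerable<WrappedToken> tokens)
    {
        var result = new StringBuilder();
        foreach (var token in tokens)
            Append(result, CssClass(token.Type), token.Value);
        return result.ToString();
    }
}

public static class FormatterDemo
{
    public static void Main()
    {
        var tokens = new List<WrappedToken>
        {
            new WrappedToken("return", TokenType.Keyword),
            new WrappedToken(" 42;", TokenType.Plain),
        };

        // Prints: return 42;
        Console.WriteLine(Formatter.FormatWithLeftoverState(tokens, TokenType.Plain));
        // Prints: <span class="keyword">return</span> 42;
        Console.WriteLine(Formatter.Format(tokens));
    }
}
```

The buggy variant loses the keyword highlighting because the per-token type never reaches the formatter.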
Asking the model to just “fix errors” pointed it in the right direction (the wrong token type), but then it got lost: it created an unnecessary localTokenType variable and then tried to access a private field. So, let’s roll back that step in CodeVROOM and copy-paste a detailed, but generic prompt:
this version causes a build error in the following line:
Value = _originalToken.GetText();
write 3 most likely causes of it, considering that the original code worked fine, then write the fixed version.
write the entire modified file.
This time the model doesn’t see the definition of Token and has to guess it from the original version of the code. So, it makes a few not-so-useful guesses, before finally arriving at the correct conclusion:
1. The `GetText` method is not a part of the `UniversalTokenizer.Token` struct, but it was available in the original code as an extension method or a method in the same class.
2. The `GetText` method has been removed or renamed from the `UniversalTokenizer.Token` struct or its extension methods.
3. The `GetText` method requires an additional parameter (like the line text) which is missing in the current call.
Here is the fixed version of the modified file, assuming that the `GetText` method requires the line text as a parameter:
This also ended up storing a reference to the entire line in each token, because the decision to add a new backing field was made before writing the actual constructor.
Giving very specific, but concise prompts works like a charm though: “+value to ctor” followed by “+token.Type” both hit the bull’s eye, and only take about a second each.
Naming
When creating new code, models tend to use the average naming style from the training set, but you can override it with sticky guidelines. Just write plain-text instructions on what naming to use when, and they will be appended to each editing request as a separate Markdown section.
Conclusion
If your software design includes auxiliary wrappers for moving data around, AI can create them very quickly from fairly short prompts. The result won’t be perfect, but you can quickly reshape it to your liking with 2-3-word prompts.