My Syntax leads to Nova using 100% CPU; how can I debug this?

I’m trying to write a syntax for the Zig language. This is what I have so far (gist, because it’s quite long):

It works fine on this input:

const @"e" = "32";
var b = 42;
var c = "ae" and hurr;
const foo = struct {
  const bar = union {
    garr, barr
  };
  
  foo: u32,
};

fn myfunction(arr: u32, knarr: bool) bool {
  var foo = "bar";
}

test "foo" {
  var foo = bar;
  var bao = far;
  var c = d;
  const a = struct {
    foo: bar,
  };
}

However, when I replace var foo = bar; in test "foo" with var foo = bar{};, syntax highlighting stops working and Nova starts using 100% CPU. I figure Nova runs into some kind of endless loop while applying the syntax, but I cannot figure out what goes wrong. In particular, I disabled the portion of the syntax that would parse the {} suffix here, and the error did not go away.

The behavior of the editor is also curious:

Screenshot 2021-10-18 at 16.49.18

For some reason it highlights foo, the succeeding space, and the final semicolon. I have no idea why this happens, but I suspect it is related to the problem I am running into. This doesn’t happen after adding {} because then the editor stops doing this kind of highlighting altogether.

I am aware that the syntax is quite long so I do not expect someone to tell me what specifically is wrong with it. What I am asking for is how to debug this syntax and whether there may be a Nova bug that leads to the full CPU usage. The extension console does not help me as it doesn’t show how the syntax is applied. An option that details the parser descending/navigating the sublevels of the syntax would be extremely helpful.

What would be helpful would be for you to post what syntax scope Nova assigns to your examples. You can find that out by using the Scope Inspector (that little crosshairs icon in the status bar; just activate it and move the pointer over the relevant areas). Be detailed, i.e. check what each highlighted area is scoped as both in the working and non-working examples and in any variant you try. With that info, I could take a peek … a 600+ lines syntax file is not that daunting.

Neat, I didn’t know that!

I get the following scopes from the scope inspector for the line

var foo = bar;
  • var: zig.keyword.construct; zig.var-decl; zig.test-decl
  • foo: zig.identifier; zig.identifier.variable.name; zig.var-decl; zig.test-decl
  • foo (with space): zig.identifier.variable.name; zig.var-decl; zig.test-decl
  • = bar: zig.assignment; zig.var-decl; zig.test-decl
  • bar: zig.complex-expression; zig.assignemnt; zig.var-decl; zig.test-decl
  • whole line: zig.var-decl; zig.test-decl

As soon as I add {} after bar (or even just {), the inspector doesn’t work anymore.

This is still a problem. I figured out the error also happens by appending () or , to bar. I have re-read my definitions a couple of times but still can’t figure out what’s wrong. Could you have a look?

Hi, I am currently struggling with some health issues which affect both my availability and my ability to focus, so I cannot make any promises. Sorry!

No worries, your health obviously takes priority. Take your time and get well.

Hi, @flyx. I’ve started experimenting with Zig and was thinking of creating a syntax definition for it, but it looks like you’ve made a good deal of progress already.

I have zero experience creating these things and I’ve found the documentation quite daunting, but I can poke at what you have and see if I can stumble across a solution. Is the code in that Gist still the latest code you have for this project? If not, would you mind publishing the current state of your project somewhere? Thanks in advance.

@Albright The gist is where I left it. You are free to expand on it, treat it as MIT-licensed.

I am currently waiting for an announcement following the discussion of the future of the parsing engine in Nova and hope for TreeSitter to be adopted. I wouldn’t want to put any effort in the grammar right now when a potential switch of engines is on the horizon.

If I had to guess I’d say there is a regular expression somewhere that is too greedy and keeps looping over and over. I like regex101 for testing regexes against text samples.
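
One thing that can be checked outside Nova: a pattern that can match without consuming any text is a common way for regex-driven highlighters to get stuck. Whether that is what happens here is only a guess. The sketch below uses Python's `re` module, which is not the engine Nova uses, so treat the result as a hint only. The pattern names and expressions are hypothetical and not taken from the gist.

```python
import re


def can_match_empty(pattern: str) -> bool:
    """Return True if the pattern matches the empty string."""
    return re.fullmatch(pattern, "") is not None


# Hypothetical patterns for illustration; replace with the ones from your syntax.
patterns = {
    "identifier": r"[A-Za-z_][A-Za-z0-9_]*",
    "optional-suffix": r"(?:\{\})?",
    "whitespace": r"\s*",
}

# Collect the names of all patterns that can succeed without consuming input.
suspects = [name for name, rx in patterns.items() if can_match_empty(rx)]
print(suspects)
```

Any pattern that shows up in `suspects` and is used as a start expression or inside a repeating construct would be the first thing I'd look at.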

To debug it I would remove your whole syntax and add it back in chunk by chunk. You could add back in the first half by itself, then the second half by itself, and see which half is bad. Then keep the good half and add back in the next quarter, next quarter, and so on, until you narrow it down. Or just test one scope at a time to see if each one works independently.
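
The halving procedure can be sketched like this. It is only an outline of the idea: `is_bad` stands in for the manual step of loading a reduced syntax into Nova and checking whether it hangs, the scope names are placeholders, and the sketch assumes exactly one culprit that still misbehaves when tested in isolation. Scopes that reference each other may not satisfy that.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def find_culprit(chunks: Sequence[T], is_bad: Callable[[List[T]], bool]) -> T:
    """Bisect chunks to find the single one that makes is_bad return True.

    Assumes exactly one culprit and that a subset can be tested on its own.
    """
    candidates = list(chunks)
    if not candidates:
        raise ValueError("need at least one chunk")
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half still reproduces the problem.
        candidates = first if is_bad(first) else second
    return candidates[0]


# Placeholder scope names; "scope-d" is arbitrarily marked as the bad one.
scopes = ["scope-a", "scope-b", "scope-c", "scope-d", "scope-e"]
culprit = find_culprit(scopes, lambda subset: "scope-d" in subset)
print(culprit)
```

With n scopes this needs roughly log2(n) reloads instead of n, which matters when every check means restarting a hung editor.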

Also have a test code sample that you can test against, which contains examples of all of the language’s structures, plus examples with incorrect syntax to test that the syntax doesn’t hang when incorrect language syntax is used.
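
For the failing line in this thread, a small script could generate one test file per variant so each can be opened on its own. This is just a sketch: the suffixes are the ones reported above as triggering the hang (`{}`, `{`, `()` and `,`), while the file names and the output directory are made up.

```python
from pathlib import Path
from typing import Dict

# Template for the test block; doubled braces are literal braces for str.format.
BASE = 'test "foo" {{\n  var foo = bar{suffix};\n}}\n'

# Variant names are invented; the suffixes come from the reports in this thread.
SUFFIXES = {
    "plain": "",
    "braces": "{}",
    "open-brace": "{",
    "call": "()",
    "comma": ",",
}


def make_variants() -> Dict[str, str]:
    """Return a mapping of variant name to Zig source text."""
    return {name: BASE.format(suffix=suffix) for name, suffix in SUFFIXES.items()}


def write_variants(directory: Path) -> None:
    """Write each variant to <directory>/<name>.zig."""
    directory.mkdir(parents=True, exist_ok=True)
    for name, source in make_variants().items():
        (directory / f"{name}.zig").write_text(source, encoding="utf-8")


if __name__ == "__main__":
    for name, source in make_variants().items():
        print(f"--- {name}.zig ---")
        print(source)
```

Calling `write_variants(Path("zig-samples"))` would put the files in a hypothetical `zig-samples` folder. Some variants are deliberately invalid Zig, which is the point: the syntax should not hang on them either.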

I understand this sentiment to some degree, but as far as I know, there hasn’t been a decision in this regard yet, much less actual progress in terms of code, so I’m guessing this change is still at least a year out.

For what it’s worth, I’ve found, after a good deal of trial and error, that after commenting out line 47, the parser is able to parse both a simple “hello world” and a substantially more complex file without any issues. I have no idea why yet, but at least it’s hopefully some progress.

Panic posted on Twitter last month that there’s already a beta that integrates the Tree Sitter parsing library: Nova × Improved Parser Beta

Oh, cool. Glad to see by how much I was underestimating the progress.

Looks like there’s a fairly active Tree Sitter implementation for Zig, though it’s sorely missing installation instructions.