I’m trying to create symbols for a Rust extension. For anyone unfamiliar, Rust defines struct methods (and enum methods, cuz Rust be wildin’) in an implementation block outside the scope of the struct definition. It looks like this:
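A minimal sketch of that layout, using made-up names (`Point`, `Shape`) rather than code from the extension itself:

```rust
// Hypothetical types, for illustration only.
struct Point {
    x: f64,
    y: f64,
}

// Methods live in a separate `impl` block, not inside the struct body.
impl Point {
    fn new(x: f64, y: f64) -> Self {
        Point { x, y }
    }

    // Distance from the origin.
    fn norm(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
}

// Enums get their methods the same way.
enum Shape {
    Circle(f64),
    Square(f64),
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Square(s) => s * s,
        }
    }
}
```

The point for symbolication is that `new`, `norm`, and `area` are siblings of the type definition in the source, not children of it.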
My attempt at nesting methods kind of worked with the start-next-end context. I set the struct definition as behavior="start" and the implementation block as behavior="next", set group-by-name="true", and didn’t set a symbol type for the implementation block (if I did, then it would create a nested struct with the same name with nested methods). There are a few issues I can’t resolve:
Because there can be multiple impl blocks per struct/enum, I can’t set an end for the behavior. I couldn’t seem to end the symbol, so anything defined afterwards was also nested under the struct symbol. Setting the unclosed attribute had no effect.
The Rust compiler allows structs to be defined after struct methods. In this case, the methods aren’t nested below the struct.
Due to the above, I’m not sure if the structs are actually being combined, or if it just looks like it because the untyped symbol doesn’t nest things, and then the start symbol nests everything below it.
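To make the ordering issues above concrete, here is a small sketch with hypothetical names (`Counter`, `double`). It only shows arrangements the Rust compiler accepts: an impl block before its struct, an unrelated item in between, and a second impl block for the same struct.

```rust
// Hypothetical names; this only demonstrates item orderings that compile.

// An `impl` block may appear before the type it extends.
impl Counter {
    fn new() -> Self {
        Counter { count: 0 }
    }
}

struct Counter {
    count: u32,
}

// An unrelated item sitting between the two impl blocks.
fn double(n: u32) -> u32 {
    n * 2
}

// A second impl block for the same struct, not contiguous with the first.
impl Counter {
    fn increment(&mut self) {
        self.count += 1;
    }

    fn value(&self) -> u32 {
        self.count
    }
}
```

With a start/next grouping, `double` is the kind of item that ends up nested under `Counter` by accident, and `new` is the kind of method that is missed because it precedes the struct.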
Relevant start-next-end syntax definitions for structs
It seems that the reason things are being collected recursively is that your function-blocks collection includes the entire grammar recursively (using syntax="self"). While the parse tree is likely being constructed properly, using a “start” and “next” symbol behavior without an explicit “end” somewhere is likely not designed to do what you intend.
Truncating symbolic ranges with the “truncate” option is only applied at the end of a document when there are no more symbols to apply, not when a new “start” context is encountered. By then, each Rust struct has already been embedded within its (incorrect) “parent,” hence why your screenshot shows it recursively nesting.
My suggestion would be to symbolicate the ending bracket } in some way as an "end" marker. It should automatically end whatever container is active (be it struct or method). Another possibility might be to symbolicate the struct and impl separately.
Unfortunately, this is a consequence of how Rust separates the struct definition and impl definitions, as the parser wasn’t designed to handle this behavior (we haven’t seen a counterpart in other languages so far). I can investigate whether a change to this in the future might be helpful!
Thank you so much for looking into this and for your detailed write-up!
I mentioned above that I don’t think I can designate an "end" marker because the syntax allows multiple impl blocks for structs. The answer for now seems to be to not try and nest methods under structs, so I appreciate the help!
I don’t think Rust is entirely unique in attaching methods to structs outside of the struct definition. I can at least point to Go as an example where methods are attached to structs by defining a “receiver” for the function (for reference). I would agree that this approach is uncommon.
For @logan or anyone else constructing symbol queries with Tree-sitter – does Tree-sitter enable anything new as far as combining symbol definitions? Am I correct that the scope.groupByName setting only applies to queries using @start or @end captures?
It would be great if scope.groupByName could “merge” symbols with the same name with a @subtree scope. The @start and @end captures don’t seem to work well with impl blocks – these blocks often are not contiguous and there can be several of them. I think it would be a lot cleaner and more understandable and navigable to have all methods nested under a single struct symbol, rather than symbols for each impl block.
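As a toy illustration of the kind of merge described here, the sketch below folds impl symbols into the struct of the same name. It is plain Rust, not Nova's or Tree-sitter's API: `Symbol`, `Kind`, and `merge_by_name` are names invented for this example, and it assumes symbols have already been extracted into a flat list.

```rust
use std::collections::HashMap;

// Hypothetical symbol model, not an editor API.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Struct,
    Impl,
}

#[derive(Debug, Clone, PartialEq)]
struct Symbol {
    kind: Kind,
    name: String,
    methods: Vec<String>,
}

/// Fold every `Impl` symbol into the `Struct` symbol with the same name,
/// regardless of source order or how many impl blocks exist.
/// Structs come first in the result; impl blocks with no matching struct
/// are appended unchanged after them.
fn merge_by_name(symbols: Vec<Symbol>) -> Vec<Symbol> {
    let mut merged: Vec<Symbol> = Vec::new();
    let mut struct_index: HashMap<String, usize> = HashMap::new();

    // First pass: record where each struct lands in the output.
    for s in symbols.iter().filter(|s| s.kind == Kind::Struct) {
        struct_index.insert(s.name.clone(), merged.len());
        merged.push(s.clone());
    }

    // Second pass: attach each impl block's methods to its struct.
    for s in symbols.into_iter().filter(|s| s.kind == Kind::Impl) {
        match struct_index.get(&s.name) {
            Some(&i) => merged[i].methods.extend(s.methods),
            None => merged.push(s),
        }
    }

    merged
}
```

Doing the merge in two passes is what makes it independent of order, so an impl block that appears before its struct is still attached to it.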