Introduction

Go To Definition is one of the features that distinguish an editor specialized for writing code from a generic editor. This is the editor command that allows us to quickly find where a symbol is defined elsewhere in the code. For example, when we encounter a function call, we can click on the function’s name and the editor will teleport us to where the function is defined. This command is associated with the Ctrl/Cmd+Click combination in most editors, and with the F12 keyboard shortcut in Visual Studio Code.

An application or library can be made of many files, some of which may span hundreds or even thousands of lines. A way to navigate inside such complexity makes it much more approachable. And, having it available with a mouse click helps to avoid distractions that break your flow of thought.

This tutorial continues our series on the Language Server Protocol (LSP). This is an open specification by Microsoft that allows exposing services for programming languages and DSLs so that different editors supporting the protocol can consume them. By using the LSP, with moderate effort and a shared codebase we can reach a good level of tool support for our language on several editors and IDEs, including Microsoft’s Visual Studio Code, Eclipse Theia, and IntelliJ IDEA (through a third-party plugin).

As in our previous article on code completion, we’ll build a working example integrated into Visual Studio Code. In turn, this is based on the code we developed for the article, Code Completion with antlr4-c3.

It’s not necessary to know about code completion to follow this tutorial, or even VSCode and the LSP for that matter. The idea of Go To Definition is not specific to a single editor, and the techniques presented here are applicable to other editor technologies. We’ll also be making some general considerations about this feature. That said, we’ll be reusing some concepts and code from previous articles, and we’ll frequently include links to them.

We’re sharing the full source code for this tutorial on GitHub.

Go To Definition, Declaration, Implementation, …

Actually, the Language Server Protocol supports several “go to …” commands. Language implementers may want to provide more than one to match the semantics of their language. For example, in the C and C++ languages, a function may be declared in a header file, but defined in another .c source file. Or, in many object-oriented languages such as Java and Kotlin, a method may be defined in an interface, trait, abstract class, or similar, and implemented in several derived classes. C# even allows “partial classes”, that is, classes defined over multiple files.

In this tutorial, we’ve chosen to only implement Go To Definition, but the same principles apply to the other variants.

Setup

Instead of starting from scratch and repeating all the boilerplate, we’ll expand the project that we’ve developed in the previous article in this series. In fact, it includes some parts that we’re going to reuse and extend; in particular, the symbol table, which already deals with (simple) import statements. That way, we’ll be able to demonstrate Go To Definition across different files, with little extra effort.

So, to start, we can clone the repository containing the code for our code completion tutorial:

git clone https://github.com/Strumenta/article-antlr4-c3-vscode

The v1.0 tag refers to the code released when the article was published, with minor improvements. We may revise the code later, to fix any issues that we may find.

Building

This is a VSCode extension developed in TypeScript; as a bare minimum, we’ll need a Node.js installation to start. Node.js includes the npm package manager that we use to download all the other dependencies, including the TypeScript compiler. We could use yarn or other tools instead of npm, but if you know about those, you probably already know how to use them.

So, to download the dependencies we’ll invoke the command:

npm install

Then, to build the project, we’ll issue:

npm run-script compile

If we installed everything properly, we should receive no errors.

The project

The extension that we just cloned includes a language server implementation for a subset of Kotlin. In particular, of all language server features, it only supports code completion (and only for variables and keywords).

Our project instructs VSCode to activate the language server for files with the .mykt extension, to avoid clashing with the official Kotlin extension that you may already have installed. We configure this in the file package.json.

For more information about the layout of the project, please refer to the article, Integrating Code Completion in Visual Studio Code. In this tutorial, we’ll add a new method to support Go To Definition, and apply some refactoring, but apart from that, we’ll keep the structure of the project intact.

Running the Extension

We’ve described how to run the extension during development in our previous tutorial, but we’ll repeat it here for convenience.

The project that we’ve cloned already includes the configuration that VSCode needs in order to run it. In fact, every extension needs a host, that is, an instance of VSCode that will provide the right environment to consume the extension’s services – in our case, by using the Language Server Protocol.

To run the extension we first need to build it. This should happen automatically after each modification to the source code, but if it doesn’t for some reason, we can trigger it manually with Ctrl/Cmd+Shift+B. Then, if everything goes fine, we can switch to the Run panel (Ctrl/Cmd+Shift+D), choose “Client” from the dropdown at the top of the window, and click on the green arrow button. This will run our extension in a fresh instance of VSCode.

If we want to debug the server, we choose “Client + Server” instead. This launches the client, which in turn starts the server in a separate process; then, VSCode attaches a debugger to the server. In any case, we can always start a debugging session at a later time, choosing “Attach to server” and pressing the green arrow after having launched the client.

Enabling Go To Definition

Now that we’ve set the project up, we can add the boilerplate that will enable the Go To Definition feature.

First, in the file, server.ts, where the language server proper is implemented, there’s a section where we declare the capabilities of our server; that is, which features of the Language Server Protocol it implements, and with which options. There, we’ll add a new capability, definitionProvider:

const result: InitializeResult = {
    capabilities: {
        textDocumentSync: TextDocumentSyncKind.Incremental,
        // Tell the client that this server supports code completion.
        completionProvider: {
            resolveProvider: true
        },
        // Tell the client that this server supports go to definition.
        definitionProvider: true
    }
};

This will tell the client (i.e., VSCode, in our case) that it can ask for definitions. However, the server does not yet know how to respond to such requests, and the client’s calls will fail with an error.

However, we can provide a new callback method:

connection.onDefinition((params) => {
    return undefined;
});

Returning undefined tells the client that we don’t know any definition of the given symbol. The user won’t be sent to the definition, but at least, the client’s requests won’t fail. This is, in fact, the absolute minimum level of “support” for Go To Definition that we start from.

Similarly, there’s a declarationProvider capability and an onDeclaration handler, and so on.
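For example, enabling Go To Declaration would follow the same enable-then-handle pattern. Here’s a hypothetical sketch, using plain values in place of the vscode-languageserver enums and connection object (recall that our example server only implements definitionProvider):

```typescript
// Hypothetical sketch: the same pattern applied to Go To Declaration.
// Plain values stand in for the library's enums and connection object.
const capabilities = {
    textDocumentSync: 2, // TextDocumentSyncKind.Incremental
    definitionProvider: true,
    // Advertise Go To Declaration support as well:
    declarationProvider: true
};

// A stub handler, as would be registered via connection.onDeclaration(...):
const onDeclaration = (_params: unknown) => {
    // Returning undefined tells the client we know no declaration.
    return undefined;
};
```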

Recording the Location of Definitions in the Symbol Table

Anyway, we want to return something a bit more interesting than undefined. Let’s start by inspecting the type of the callback function that we’ve registered with the onDefinition request handler. We can see that it’s a complex type that includes information about error handling and progress reporting.

Let’s concentrate just on successful invocations of the callback. In this case, we must return either undefined (as we’ve done earlier), a Location object, an array of Location objects, or an array of LocationLink objects.

So, in practice, we’ve got two types to choose from: Location and LocationLink. Conceptually, a LocationLink is an extended Location with more information, although it’s not strictly a derived type, for whatever reason. Also, the language server protocol documentation says “Servers should prefer returning DefinitionLink over Definition if supported by the client” and our client, VSCode, certainly supports it.

What about the array variants? Although, in theory, we can answer with multiple locations for a symbol (e.g., for an overloaded method in languages that support those), we’ll always return at most one location as our mini Kotlin language server only tracks the definitions of functions and variables. So, that rules out the array variants.

Going a bit deeper with the details, we can see that a LocationLink comprises the following information:

  • targetUri: the URI of the document containing the target of the link, i.e., the definition. Note that URIs can point to files on the local machine; they’re not reserved for network resources.
  • targetRange: the span of the full source of the link’s target, for example, the entire function definition when the user is clicking on a function name.
  • targetSelectionRange: the range that should be selected and revealed when the link is followed; in the case above, the name of the function. This range should be contained inside targetRange.
  • originSelectionRange: this optional property tells the client which portion of the source text is the actual link, e.g., a function name. If not specified, the client will apply some generic logic to identify a “word” that is clickable.
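To make these fields concrete, here’s what a LocationLink for a short function definition might look like, written as a plain object whose field names follow the LSP specification (the file name and positions are made up for illustration):

```typescript
// A LocationLink for a function defined on lines 3–5 (0-based) of a
// made-up file. Ranges use 0-based line/character positions, as the
// LSP mandates.
const link = {
    targetUri: "file:///project/utils.mykt",
    // The whole function definition:
    targetRange: {
        start: { line: 3, character: 0 },
        end: { line: 5, character: 1 }
    },
    // Just the function name, contained inside targetRange:
    targetSelectionRange: {
        start: { line: 3, character: 4 },
        end: { line: 3, character: 7 }
    }
    // originSelectionRange is optional; we’ll set it later, when
    // answering an actual request.
};
```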

Armed with this knowledge, we can now proceed to associate a LocationLink to each definition in the code, and to track the scope and uses of each function and variable – the only two program elements that we’ve decided to handle in our limited example.

Recording Locations in a Single File

Let’s start with the more constrained task of recording the location of the definitions in a single source file. This won’t change significantly when we include other files in the picture, but we can ignore that concern altogether for now.

Recall that in previous tutorials we’ve filled a symbol table using a parse tree visitor. The symbol table itself is part of the antlr4-c3 library that we’ve introduced to implement code completion. In the symbol table, we’ve recorded symbols that represent definitions of functions and variables; such symbols also double as scopes when it makes sense, i.e., just for function definitions in our limited example. So, we’ll record each function definition as a RoutineSymbol (routine being a generalization of function, method, and other similar concepts), and that symbol will also represent the lexical scope that is in effect in the function’s body.

Now, we need to associate a location for each definition. So, we’ll start by extending the symbol table:

import {SymbolTableVisitor as BaseVisitor} from "toy-kotlin-language-server";

export class SymbolTableVisitor extends BaseVisitor {}

We’ll then add a documentUri field to track the location of the source code that we’re parsing:

constructor(public documentUri: DocumentUri,
            public symbolTable = new SymbolTable("", {}),
            scope = symbolTable.addNewSymbolOfType(ScopedSymbol, undefined)) {
    super(symbolTable, scope);
}

Due to TypeScript’s inheritance rules, we have to declare the symbol table again (which we now expose as a public field), as well as the symbol that represents the global scope.

Recording an Individual Location

To record the location of a definition, we can exploit the dynamic nature of JavaScript to add a location field to every symbol that represents a definition:

protected registerDefinition(symbol: any, tree: ParseTree, definitionName: ParseTree) {
    symbol.location = LocationLink.create(this.documentUri, getRange(tree), getRange(definitionName));
}

We’ll use registerDefinition like this:

visitVariableDeclaration = (ctx: VariableDeclarationContext) => {
    let symbol = this.symbolTable.addNewSymbolOfType(VariableSymbol, this.scope, ctx.simpleIdentifier().text);
    this.registerDefinition(symbol, ctx, ctx.simpleIdentifier());
    return this.visitChildren(ctx);
};

We provide the symbol, the parse tree representing the entire definition, and the parse tree node representing just the name (of the variable, in this case). From each parse tree, we’ll construct the range of the corresponding portion of the source code:

export function getRange(parseTree: ParseTree) {
    let start, stop;
    if(parseTree instanceof ParserRuleContext) {
        start = parseTree.start;
        stop = parseTree.stop;
    } else if(parseTree instanceof TerminalNode) {
        start = stop = parseTree.symbol;
    }
    let endCharacter = stop.charPositionInLine + stop.text.length;
    return {
        start: { line: start.line - 1, character: start.charPositionInLine },
        end: {   line: stop.line - 1,  character: endCharacter }
    };
}

We have to differentiate between parser rules, that have a start token and a stop token, and terminal nodes, that have a single token called the symbol. The charPositionInLine field tracks the start position of a token in the line. This code assumes that tokens do not span multiple lines, which is true in practically every grammar for typical programming languages and DSLs.
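We can check the arithmetic with mock tokens (plain objects mimicking the ANTLR fields we use, not real Token instances). ANTLR reports 1-based lines, while the LSP expects 0-based lines, hence the - 1:

```typescript
// Mock tokens mimicking the ANTLR Token fields used by getRange:
// `line` is 1-based, `charPositionInLine` is 0-based.
const startTok = { line: 2, charPositionInLine: 4, text: "fun" };
const stopTok  = { line: 2, charPositionInLine: 8, text: "add" };

// The same computation as getRange, inlined:
const range = {
    start: { line: startTok.line - 1, character: startTok.charPositionInLine },
    end:   { line: stopTok.line - 1,  character: stopTok.charPositionInLine + stopTok.text.length }
};
// range spans line 1 (0-based), characters 4 through 11
```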

Recording Locations in Different Files

To extend our recording of definitions across different files, we can leverage the work that we’ve done in previous tutorials to handle import statements.

A quick recap: previously, we implemented a simplified import mechanism that only knows about files in the same directory. import foo will parse and analyze the file, foo.mykt, if present in the directory of the importing file. This was enough to showcase code completion across different files, and it’s enough to showcase Go To Definition, too. By severely restricting the search space of imports, we also avoid the need to index third-party libraries – a task that every IDE has to perform, as we’ll see in more detail in a later section.

So, as of the previous tutorial, if a Kotlin file has import statements, we parse the corresponding files one by one, and populate the symbol table:

let imports = parseTree?.preamble()?.importList()?.importHeader();
if(imports) {
    processImports(imports, uri, symbolTableVisitor);
}
//...
function processImports(imports: ImportHeaderContext[], uri: string, symbolTableVisitor: SymbolTableVisitor) {
    let basePath = computeBasePath(uri);
    for(let i in imports) {
        const filename = imports[i].identifier().text + ".mykt";
        const filepath = basePath + filename;
        if (fs.existsSync(filepath)) {
            processImport(filepath, symbolTableVisitor);
        } else {
            connection.window.showErrorMessage("Imported file not found: " + filepath);
        }
    }
}

function processImport(path: string, symbolTableVisitor: SymbolTableVisitor) {
    try {
        let data = fs.readFileSync(path);
        let input = CharStreams.fromString(data.toString());
        let lexer = new KotlinLexer(input);
        let parser = new KotlinParser(new CommonTokenStream(lexer));

        let parseTree = parser.kotlinFile();
        symbolTableVisitor.visit(parseTree);
    } catch (e) {
        connection.window.showErrorMessage("Cannot read from imported file " + path + ": " + e);
        console.error(e);
    }
}

Now, for our purpose, we also need to associate the definitions in each imported file to their correct location. Recall that our symbol table has a documentUri field that we use to track the currently parsed document; we just have to update that for each imported file:

function processImports(imports: ImportHeaderContext[], symbolTableVisitor: SymbolTableVisitor) {
    let uri = symbolTableVisitor.documentUri;
    let baseUri = computeBaseUri(uri);
    let basePath = ensurePath(baseUri);
    for(let i in imports) {
        const filename = imports[i].identifier().text + ".mykt";
        const filepath = basePath + filename;
        if (fs.existsSync(filepath)) {
            symbolTableVisitor.documentUri = baseUri + filename;
            processImport(filepath, symbolTableVisitor);
        } else {
            connection.window.showErrorMessage("Imported file not found: " + filepath);
        }
    }
    symbolTableVisitor.documentUri = uri;
}

We restore the original documentUri after we’re done, as after processing imports, we’ll analyze the main file.

We’re not showing the definitions of helper functions such as computeBaseUri for brevity. You can find them in the source code accompanying this tutorial.
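As a rough idea, though: computing the base URI essentially amounts to stripping the last path segment from the document URI. Here’s a hypothetical minimal sketch, not the repository’s actual implementation:

```typescript
// Hypothetical sketch of a computeBaseUri-style helper: keep everything
// up to and including the last slash of the document URI.
function computeBaseUriSketch(uri: string): string {
    const lastSlash = uri.lastIndexOf("/");
    return lastSlash >= 0 ? uri.substring(0, lastSlash + 1) : uri;
}
// computeBaseUriSketch("file:///project/src/main.mykt")
//   yields "file:///project/src/"
```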

Returning the Location of a Definition

Armed with all the infrastructure we’ve built so far, we can now properly implement our onDefinition callback.

First of all, we need to parse the source code and compute the caret position:

let uri = params.textDocument.uri;
let document = documents.get(uri);
let input = CharStreams.fromString(document.getText());
let lexer = new KotlinLexer(input);
let parser = new KotlinParser(new CommonTokenStream(lexer));
let parseTree = parser.kotlinFile();

let pos = params.position;
let position = computeTokenPosition(parseTree, parser.inputStream,
  { line: pos.line + 1, column: pos.character });

For more details, please refer to our previous tutorial.

Then we can build our symbol table:

let symbolTableVisitor = new SymbolTableVisitor(document.uri);
symbolTableVisitor.visit(parseTree);

Now that we have everything we need, we’ll compute and return the location of the definition, if we can find it:

if(position && position.context) {
    const scope = getScope(position.context, symbolTableVisitor.symbolTable);
    let definition = findDefinition(position.context.text, scope);
    if(definition && definition.location) {
        return definition.location;
    }
}

If we don’t find a definition, we return undefined, as before.

We could guard the type of the node position.context to only allow for identifiers and discard all other nodes. However, that would complicate the code with little benefit beyond some performance gains (we wouldn’t search for the scope and definition of a node that cannot possibly refer to any definition).

Note that we’ve reused the getScope function that we defined in a previous tutorial. This function returns the closest lexically enclosing scope of a given node in the parse tree. In our case, the node corresponds to the text under the user’s caret (if using only the keyboard), or to the text being the target of a mouse click. It may or may not represent an identifier; only if it does may we find and return a definition’s location. If it’s not an identifier, we’ll never find anything and we’ll always return undefined.

Sample code annotated with scopes:

val inGlobalScope = 42          // blue: global scope

fun outer() {                   // yellow: outer’s scope
    val inOuterScope = "outer"

    fun inner() {               // red: inner’s scope
        val inInnerScope = Object()
    }
}

The red scope is represented by the inner symbol, the yellow one is represented by the outer symbol, and the blue one is the global scope.

Computing the Location

The actual work to compute the location happens in the findDefinition function whose implementation we’ve previously omitted. Let’s now have a look at it:

export function findDefinition(name: string, scope: Symbol) {
   while(scope && !(scope instanceof ScopedSymbol)) {
       scope = scope.parent;
   }
   if(!scope) {
       return undefined;
   }
   let symbol = (scope as ScopedSymbol).getSymbolsOfType(Symbol).find(s => s.name == name);
   if(symbol && symbol.hasOwnProperty("location")) {
       return symbol;
   } else {
       return findDefinition(name, scope.parent);
   }
}

We pass in the name of the definition and the scope from where we start the search. This is, initially, the scope enclosing the context associated with the given position in the document, as we’ve seen in the previous tutorial. Note that, given how we construct our symbol table, we always have at least one scope: the symbol representing the global scope. Also, we’ve guaranteed that every identifier (of the supported types) is associated with some scope.

So, we recursively walk the scope chain until we find the scope that contains a symbol of the given name. We’re looking for symbols of any type; the “any symbol type” is represented by the Symbol class, which is the superclass of all symbol types. We also check if a candidate symbol has a location property. In that case, we’ve found a definition and we return it. 

If we arrive at the global scope and don’t find anything, we return undefined. If, however, we do find a scope with the appropriate definition, it’ll have the location attached to it, thanks to the work we’ve done on the symbol table.
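The lookup can be illustrated in isolation with minimal stand-ins for the antlr4-c3 classes (Sym and ScopedSym below are mocks written for this example, not the library’s API):

```typescript
// Minimal stand-ins for antlr4-c3's Symbol/ScopedSymbol, just enough
// to demonstrate the scope-chain walk in findDefinition.
class Sym {
    location?: object;
    constructor(public name: string, public parent?: ScopedSym) {}
}
class ScopedSym extends Sym {
    children: Sym[] = [];
}

function findDef(name: string, scope: Sym | undefined): Sym | undefined {
    // Climb to the nearest enclosing scope, if we started on a plain symbol.
    while (scope && !(scope instanceof ScopedSym)) {
        scope = scope.parent;
    }
    if (!scope) return undefined;
    const symbol = (scope as ScopedSym).children.find(s => s.name === name);
    // Only symbols carrying a location count as definitions.
    if (symbol && symbol.location) return symbol;
    // Not found here: continue the search in the enclosing scope.
    return findDef(name, scope.parent);
}

// The global scope contains x; an inner scope contains nothing.
const globalScope = new ScopedSym("<global>");
const x = new Sym("x", globalScope);
x.location = {};
globalScope.children.push(x);
const innerScope = new ScopedSym("outer", globalScope);
globalScope.children.push(innerScope);
// Looking up "x" from the inner scope walks up to the global scope.
```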

Enriching the LocationLink

The location object that we’re returning is one of those that we computed in the symbol table with the code:

symbol.location = LocationLink.create(this.documentUri, getRange(tree), getRange(definitionName));

However, LocationLink has another optional field that we haven’t set: originSelectionRange. Recall that this represents the portion of the source code that can be clicked to go to the definition; as such, we couldn’t set it in the symbol table. However, we can compute it now.

Indeed, the actual code to return the complete location is:

if(definition && definition.location) {
    return {...definition.location, originSelectionRange: getRange(position.context) };
}

Here, we enrich the original location by adding the originSelectionRange. We compute it with the getRange function that we’ve seen above. Its input is the context associated with the position in the source code where the user asked for a definition.

Note that we don’t modify the original location; rather, we create a new object that includes all the fields of the location stored in the symbol table, using JavaScript’s object spread syntax.
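A quick self-contained illustration of why the spread leaves the cached object untouched (the ranges below are made-up values):

```typescript
// The cached link, as stored in the symbol table (no origin range yet):
const cached = {
    targetUri: "file:///project/file.mykt",
    targetRange: { start: { line: 3, character: 0 }, end: { line: 5, character: 1 } },
    targetSelectionRange: { start: { line: 3, character: 4 }, end: { line: 3, character: 7 } }
};
// Enrich a copy with the origin range; the spread copies every field of
// `cached` into the new object, leaving `cached` itself unmodified.
const enriched = {
    ...cached,
    originSelectionRange: { start: { line: 10, character: 2 }, end: { line: 10, character: 5 } }
};
```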

We should also note that the editor uses the originSelectionRange to highlight a portion of the source code. Therefore, it actually calls onDefinition before the user even clicks on the link! So, we should answer those requests as quickly as possible; users don’t expect highlighting a piece of code to take a noticeable amount of time.

Improving Performance

This brings us to the next topic – performance. As usual, when talking about performance, we should only act if there’s an actual problem that impacts user experience, and only after we’ve profiled our program to identify bottlenecks.

Here, we’ll just offer an easy solution to tackle the most common performance bottleneck in this kind of extension: parsing. In fact, the way that we’ve structured our solution so far, we’re parsing the full source code and filling the symbol table, every time the user asks for code completion or for a definition. These are not the best moments to perform a time-consuming operation such as parsing. That’s because the user is not modifying the source – they’re focused on getting an answer and they want it right away.

So, we can improve the situation by employing a cache, the most common expedient for getting better performance, trading memory for speed. In particular, we need to cache three objects:

  • the parser
  • the parse tree
  • the symbol table, filled with definitions

We use all three objects when we send the user to some definition, as we’ve seen above.

The most natural place where we can cache our three objects is inside a TextDocument. This object represents a file that is open in the editor. Thanks to the dynamic nature of JavaScript, we can attach additional information to text documents by adding new fields to them. Also, the text document manager that we already have in place (see our previous tutorial in the series) ensures that they’re garbage-collected when they’re no longer in use. That way, we don’t have to worry about purging orphan entries from the cache.

So, as a first step, we can gather parsing and filling the symbol table into a new function, ensureParsed, that will also store the objects inside the document:

function ensureParsed(document: TextDocument) {
    if(document["parser"]) {
        return { parser: document["parser"], parseTree: document["parseTree"], visitor: document["symbolTableVisitor"] };
    }
    const input = CharStreams.fromString(document.getText());
    const lexer = new KotlinLexer(input);
    const parser = new KotlinParser(new CommonTokenStream(lexer));
    const parseTree = parser.kotlinFile();
    const symbolTableVisitor = new SymbolTableVisitor(document.uri);

    const imports = parseTree?.preamble()?.importList()?.importHeader();
    if(imports) {
        processImports(imports, symbolTableVisitor);
    }
    symbolTableVisitor.visit(parseTree);

    document["parser"] = parser;
    document["parseTree"] = parseTree;
    document["symbolTableVisitor"] = symbolTableVisitor;
    return {parser, parseTree, visitor: symbolTableVisitor};
}

There’s nothing really new under the sun; we’ve done little more than refactoring some of our code. Part of the caching happens in the first lines of the function, where existing objects are returned, rather than recomputed, if they’re already associated with the document – a cache hit. The other part happens in the last few lines, where, following a cache miss, the objects are computed and then stored in the document.

Now, where we used to do the parsing etc. before, we’ll have a single line of code:

const {parser, parseTree, visitor} = ensureParsed(document);

This will either perform the parsing or return the cached objects, as we’ve seen. However, now we’re doing too much caching. We’re never marking the cached objects as stale, so no matter how many times the user changes a document, we’ll keep referring to the first version of it that we’ve parsed!

So, we clearly need a method to mark a document for reparsing. We’ll do that by simply removing the three objects associated with it:

function markForReparsing(document: TextDocument) {
    document["parser"] = undefined;
    document["parseTree"] = undefined;
    document["symbolTableVisitor"] = undefined;
}

Now that we have markForReparsing, we’ll invoke it whenever a document’s contents change due to some modification by the user. To do that, we can listen to the appropriate event on the text document manager:

documents.onDidChangeContent(change => {
    markForReparsing(change.document);
});
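The whole cache-then-invalidate cycle can be sketched with a plain mock object standing in for a TextDocument, and a counter standing in for the expensive parse:

```typescript
// Mock document: a plain object we attach cached fields to, mirroring
// what ensureParsed and markForReparsing do with a real TextDocument.
const doc: { [key: string]: any } = { text: "fun main() {}" };
let parseCount = 0;

function ensureParsedMock(d: { [key: string]: any }) {
    if (d["parseTree"]) {
        return d["parseTree"]; // cache hit: no reparse
    }
    parseCount++; // stands in for the expensive lexing/parsing work
    d["parseTree"] = { parsedFrom: d.text };
    return d["parseTree"];
}

function markForReparsingMock(d: { [key: string]: any }) {
    d["parseTree"] = undefined; // the next ensureParsedMock call reparses
}

ensureParsedMock(doc);      // parses (parseCount becomes 1)
ensureParsedMock(doc);      // cache hit (parseCount stays 1)
markForReparsingMock(doc);  // the document changed: invalidate
ensureParsedMock(doc);      // reparses (parseCount becomes 2)
```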

We may also want to immediately reparse the document just after each modification; in general, that will be necessary anyway because we’ll want to report errors and warnings to the user as they type:

documents.onDidChangeContent(change => {
    markForReparsing(change.document);
    ensureParsed(change.document);
});

Note that “just after each modification” may also mean asynchronously, which would give us a chance to cancel parse jobs that have been rendered unnecessary by subsequent modifications by the user. However, we’re not getting that advanced here.

Going even further, IDEs often employ incremental parsing techniques to reduce the amount of code parsed after each modification. In fact, typically a user only modifies a small portion of the source text in a file at a time, while the rest of the file stays untouched. Reparsing the whole file is a waste, especially if it’s big.

While there are some projects that explore that possibility for ANTLR parsers and lexers, this is out of the scope of this tutorial.

Individual Files or Project? Editor vs IDE

The issue of caching opens up a more general topic that is not related only to performance: indexing. The language server protocol is generally oriented towards requests that involve individual documents, i.e., source files. It has some notion of workspace folders, but it has no concept of a “project” that includes dependencies.

This is one of the key differences between an editor such as VSCode, which is targeted primarily at editing individual files in simpler projects, and a full-blown IDE, which has a stronger concept of a project (such as an NPM package, a Maven-based Java project, and so on). In an IDE, typically you don’t open a directory; rather, you import a project. The IDE then analyzes and indexes (i.e., caches) all the source files, generated files, libraries, and so on.

LSP-based solutions will need to do something similar if they want to offer comprehensive language services. For example, in a real-world implementation of a Kotlin language server, we’ll want to resolve import statements that refer to libraries that may be declared e.g. in a pom.xml file (Maven) or build.gradle file (Gradle). And, of course, we’ll want Go To Definition to show the source of those types and their members as well, if available.

However, the LSP itself offers little support in that regard. This is an area where we, as the language engineering community, could make an effort towards finding a general solution. So far, each language server handles this differently, or not at all.

Conclusions and Further Reading

Go To Definition is an important part of what makes a code editor useful. In this tutorial, we’ve enhanced a VSCode extension based on the language server protocol with this feature. You can find all the code on our GitHub repository.

That said, the user experience can be further improved with other Language Server Protocol features. We’ve already talked about the various Go To … variants that the LSP supports, but there’s more. For example, we may want to display documentation or other information when the user hovers with the mouse over some identifier (Hover Request). Or we may present function/method signature information (Signature Help Request), et cetera.

All these features can reuse part of what we’ve done for Go To Definition, as the general pattern is the same: store some information about definitions in the symbol table, then retrieve it when requested and return it to the client. However, they also require enough specific logic that going into the details here would have been too much.
