Split long Markdown text into smaller segments and send them to the GitHub Copilot SDK for sequential processing to avoid "context deadline exceeded"

    I built a local Chinese-to-English translation service using the free model of the GitHub Copilot SDK. With long texts, however, it consistently fails to return a complete translation and throws an error.

    Error message

    When processing long texts, the GitHub Copilot SDK may throw an error:

    waiting for session.idle: context deadline exceeded

    The SDK does not refuse the request. It returns part of the content and then stops producing further output. This typically occurs in scenarios that require long output, such as translating long documents or generating lengthy content.

    I’m not sure whether there is a limit on the number of returned tokens, a limit on the total request duration, or some other cause. I couldn’t find any related discussion in the official GitHub issues.

    Splitting long Markdown text

    To work around this issue, I split long Markdown text into smaller segments and send them to the GitHub Copilot SDK one at a time. Each request is then short and should avoid the error.
    This matches what I have seen so far: translating short titles has never triggered the problem.

    Because the texts to be translated are my own blog posts and have no complex structure (just ordinary Markdown), I split them at second-level headings. Each second-level heading starts a new segment, and each segment is sent as its own translation request. Finally, I concatenate the translated segments to form the complete translation. In my test the result was good:

    Connected to server. Waiting for tasks...
    Received task for Article ID: 1661
    Translating Title...
    Translating Content in segments...
    Translating Content Fragment 1/3...
    Translating Content Fragment 2/3...
    Translating Content Fragment 3/3...
    Result sent successfully.
    

    Third-party libraries for splitting Markdown

    The current approach of splitting by second-level headings does work for my own needs, but if a Markdown document has a more complex structure or longer sections, this method may not be suitable. For example, if a section contains third-level headings, many lists, or code blocks, splitting by second-level headings may not be fine-grained enough.

    I found a third-party Go library, langchaingo, which provides text-splitting functionality and can split text according to Markdown structure:

    https://pkg.go.dev/github.com/tmc/langchaingo/textsplitter

    What is langchaingo?

    Building applications with LLMs through composability, with Go! This is the Go language implementation of LangChain.

    Bringing in langchaingo for such a simple need feels a bit heavy, so I’m not considering it for now.

    About the Author 🌱

    I am a developer from Yantai, Shandong, China. If you have any interesting topics or software development needs, feel free to email me at zhongwei.sun2008@gmail.com for a chat, or follow my personal public account "Elephant Tools".