Interact with LLM in JSON #743
Hi @DonggeLiu, can you assign this issue to me?
Hi @nugget-cloud, thanks for being willing to help. #744 could be a good starting point for you : )
Hi @DonggeLiu, please correct me if I am wrong here.
Thanks @nugget-cloud! Before we dive in, I should mention that this is exploratory work, meaning it may improve results or lead to lower performance. Potential challenges include handling quoted/unquoted strings in JSON, confusing the LLM somehow, or unexpected compatibility issues. Unfortunately, if the outcome turns out to be worse than the current approach (e.g., a lower fuzz target build rate), we won't be able to merge the changes. If you are keen, I suggest breaking this task into stages, starting with a simpler change before moving to more complex ones.

Stage 1: Converting XML prompts to JSON

The goal here is to improve clarity in our requests to the LLM by switching from XML to JSON, which is a more commonly used format. The core logic remains the same. TODOs:
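As a rough illustration of what this stage involves, here is a minimal sketch of converting an XML-tag prompt to JSON. The tag names and prompt content are hypothetical, not taken from the project's actual prompt templates:

```python
import json

# Hypothetical prompt, first in the current XML-tag style, then as JSON.
# The tags and content are illustrative, not the project's real templates.
xml_style_prompt = (
    "<task>Generate a fuzz target for the function below.</task>\n"
    "<function-signature>int parse(const uint8_t *data, size_t size)</function-signature>"
)

# The JSON equivalent carries the same information; json.dumps handles
# quoting and escaping mechanically.
json_style_prompt = json.dumps(
    {
        "task": "Generate a fuzz target for the function below.",
        "function_signature": "int parse(const uint8_t *data, size_t size)",
    },
    indent=2,
)

print(json_style_prompt)
```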
We can stop here if Stage 2 turns out to be too complex; your help is greatly appreciated regardless!

Stage 2: Structuring LLM Responses in JSON

This stage requires deeper modifications, ensuring the LLM directly returns structured JSON responses rather than plain text. TODOs:
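As a sketch of what Stage 2 might look like, here is a minimal example of parsing a structured response, with a fallback when the JSON is malformed. The field names ("fuzz_target", "explanation") are hypothetical:

```python
import json

# Hypothetical structured response for Stage 2; the field names are
# illustrative only.
raw_response = '''{
  "fuzz_target": "int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) { return 0; }",
  "explanation": "A minimal fuzz target stub."
}'''

try:
    parsed = json.loads(raw_response)
    fuzz_target = parsed["fuzz_target"]
except (json.JSONDecodeError, KeyError) as err:
    # If the model's JSON is malformed (unquoted text, stray backslashes,
    # etc.), fall back to the existing plain-text parsing path.
    fuzz_target = None
    print(f"Could not parse structured response: {err}")
```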
If done correctly, we shouldn't need major changes to output_parser.py, as the responses should already be well-formatted. Minor adjustments, such as handling unquoted text, may be required. Once this is complete, you will have made significant progress, and we can confidently close this issue.

Stage 3 (Optional): Encapsulating LLM Interactions as Function Calls

If you're interested in exploring this further, we could encapsulate LLM interactions as function calls. The idea is to explicitly define how the LLM's responses will be used by a function and request it to return structured parameter values accordingly. For example, instead of directly asking the LLM to generate a fuzz target, we define a function that takes a fuzz target as a parameter and instruct the LLM to generate the appropriate parameter value. In practice, this is more complex, but much of the groundwork has already been done in #731, which I put on hold due to other priorities. Feel free to pick it up if you're interested!
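For a sense of the Stage 3 shape, here is a hypothetical sketch: a function declaration whose parameters describe the desired output, plus the Python function that the model's structured "call" would feed. The declaration schema below is illustrative; the exact format depends on the provider's function-calling/tool API:

```python
# Hypothetical function declaration for Stage 3. The schema shape is
# illustrative; the exact format depends on the provider's API.
SAVE_FUZZ_TARGET_DECLARATION = {
    "name": "save_fuzz_target",
    "description": "Stores a generated fuzz target so it can be compiled.",
    "parameters": {
        "type": "object",
        "properties": {
            "source_code": {
                "type": "string",
                "description": "Complete, compilable fuzz target source code.",
            }
        },
        "required": ["source_code"],
    },
}


def save_fuzz_target(source_code: str) -> None:
    """Receives the structured argument the LLM was asked to produce."""
    print(source_code)
```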
Currently we use XML tags; it would be nice to use JSON instead, as that seems to be more common, e.g., in Tool APIs.
One potential issue is escaping special characters: we have seen cases where the LLM tends to add redundant `\`s, which may have a performance impact.
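A small Python example makes the backslash concern concrete; the strings are illustrative:

```python
import json

# The intended JSON text "hello\nworld" decodes to a string containing a
# real newline:
print(json.loads('"hello\\nworld"'))    # hello
                                        # world

# With one redundant backslash the JSON still parses, but the decoded
# string now holds a literal backslash followed by 'n' instead:
print(json.loads('"hello\\\\nworld"'))  # hello\nworld
```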