Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(vision): add support for proper file and data image URLs #727

Merged
merged 3 commits into from
Sep 1, 2024

Conversation

Schuwi
Copy link
Contributor

@Schuwi Schuwi commented Aug 30, 2024

Currently when using vision models, sending image data as base64 in proper URL format (i.e. "url": "data:image/png;base64,iVBORw0K..." in JSON request) fails with a 500 error response on the API. I encountered this problem when using LibreChat with mistral.rs.

This PR fixes this issue by implementing proper URL parsing.

Backwards-compatibility is maintained by parsing naked file paths (e.g. "url": "some/file/path.png") and naked base64 input (e.g. "url": "iVBORw0K...") the same as before.

The new parser logic uses the url and data-url crates.
Unit tests have also been added.

Copy link

Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                   11          102          101            0            1
 Python                 46         2018         1718           62          238
 TOML                   20          616          543           11           62
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          196          169            1           26
 (Total)                            273          201           32           40
-------------------------------------------------------------------------------
 Markdown               29         2065            0         1570          495
 |- BASH                 5          101           98            0            3
 |- JSON                 1           12           12            0            0
 |- Python               5           92           82            0           10
 |- Rust                 6          408          365           19           24
 |- TOML                 2           75           63            0           12
 (Total)                           2753          620         1589          544
-------------------------------------------------------------------------------
 Rust                  198        62204        56501         1135         4568
 |- Markdown           102          946           13          881           52
 (Total)                          63150        56514         2016         4620
===============================================================================
 Total                 315        67537        59304         2780         5453
===============================================================================
  

Copy link
Owner

@EricLBuehler EricLBuehler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Schuwi! Thanks for the PR, it looks really great.

One thing I noticed is that there is no update to the code in mistralrs-pyo3, which also parses URLs (here). The only problem I see which prevents using the new parse_image_url function there is that the pyo3 code isn't async. Perhaps you could duplicate the function there but without async?

Copy link
Owner

@EricLBuehler EricLBuehler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@EricLBuehler EricLBuehler merged commit 82e4695 into EricLBuehler:master Sep 1, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants