Local LLM Software Compatible With AMD & NVIDIA GPUs List (2024)

With the rapid development of new software for self-hosting large language models and running local LLM inference, support for AMD graphics cards is becoming more and more widespread. Here is a list of the most popular LLM software compatible with both NVIDIA and AMD GPUs, along with plenty of additional information you might find useful if you’re just starting out. I will do my best to update this list frequently, so that it remains a useful and reliable resource for as long as possible. Last article update: November 2024

If you’re looking for the very best AMD graphics cards for local AI inference with the LLM software on this list, I’ve already put together a neat resource for picking the right GPU model for your needs. Feel free to take a look!

Base Windows/Linux Requirements For AMD GPU Compatibility – ROCm

The AMD ROCm software stack, still in active development, is one of the things you’ll need to efficiently run large language models on a newer AMD graphics card.

One of the things you need to make use of AMD graphics cards in the most efficient way (especially the newest models), regardless of which operating system you’re on, is ROCm: an open-source software stack for GPU-accelerated computing, much like CUDA from NVIDIA.

As of now, the ROCm releases for Windows still lag behind the ROCm releases for Linux, and are generally missing some of the Linux version’s components. Because of this, not all applications that rely on ROCm features will work on Windows and Linux in the exact same way.

Furthermore, for now only the newest AMD graphics cards get full ROCm support, while older ones either don’t support it at all, or do so with limited features. On Linux, you can circumvent the lack of support on AMD GPUs that don’t fully support ROCm (like the Radeon RX 67xx/66xx series, for example) by modifying the HSA_OVERRIDE_GFX_VERSION environment variable; on Windows, this is not possible without some lengthy and rather complicated workarounds.

On Windows, you install ROCm by installing the AMD HIP SDK on your system; on Linux, you can use the official install scripts available on the official AMD ROCm website.
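To give you an idea of what the Linux route looks like, here is a minimal sketch based on AMD’s amdgpu-install helper, assuming an Ubuntu system. The exact package URL and version numbers change between ROCm releases, so grab the current ones from the official installation guide before running anything:

```bash
# Minimal ROCm install sketch for Ubuntu using AMD's amdgpu-install helper.
# The placeholders below (<rocm-version>, <codename>, <pkg-version>) must be
# filled in from the current official ROCm installation guide.
wget https://repo.radeon.com/amdgpu-install/<rocm-version>/ubuntu/<codename>/amdgpu-install_<pkg-version>_all.deb
sudo apt install ./amdgpu-install_<pkg-version>_all.deb
sudo amdgpu-install --usecase=rocm   # installs the ROCm runtime and libraries
rocminfo                             # sanity check: your GPU should be listed
```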

If installing or using ROCm isn’t an option for you for one reason or another, most software on this list can also use the Vulkan graphics API or the OpenCL framework, which, while generally slower, will still let you use your AMD GPU for model inference.
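As a concrete example of the Vulkan route, here is a hedged sketch of building llama.cpp with its Vulkan backend instead of ROCm. The CMake flag and binary names have changed between llama.cpp releases, so treat this as an illustration and check the repository’s build documentation for the current ones:

```bash
# Build llama.cpp with the Vulkan backend (no ROCm or CUDA required).
# Newer releases use the GGML_VULKAN flag; older ones used LLAMA_VULKAN.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release
# Run inference with all layers offloaded to the GPU (-ngl = GPU layer count).
# The model path below is just a placeholder for any local GGUF model.
./build/bin/llama-cli -m ./models/model.gguf -ngl 99 -p "Hello!"
```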

Unsupported AMD GPUs – HSA_OVERRIDE_GFX_VERSION

As already mentioned, if you own an older AMD GPU which doesn’t fully support ROCm officially, you might still be able to benefit from ROCm GPU acceleration. Setting the HSA_OVERRIDE_GFX_VERSION environment variable overrides the GFX (graphics ISA) version that the ROCm libraries detect for your card, effectively enabling the use of certain features or software on unsupported hardware.

Once again, mind that for now modifying this variable is only possible on Linux, not on Windows. Some workarounds for this issue are out there, but they can be pretty hard to implement if you’re not a tinkerer at heart. Still, as I’ve already mentioned, AMD graphics cards can also use other GPU acceleration backends, such as OpenCL or the Vulkan graphics API, which most of the software on this list supports.
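For illustration, here is what the override looks like in practice on Linux. The value 10.3.0 is the commonly used spoof for RDNA2 cards (it makes ROCm treat a card such as the gfx1031-based RX 6700 XT as a supported gfx1030 target); other GPU families need different values, so check which one matches your card first:

```bash
# Linux only: make ROCm treat an officially unsupported RDNA2 card as gfx1030.
# Other GPU architectures require different version values.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
# Then launch your inference software from this same shell session.
```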

LLM Software Full Compatibility List – NVIDIA & AMD GPUs

Here is the full list of the most popular local LLM software that currently works with both NVIDIA and AMD GPUs. Mind that some of the programs here might require a bit of tinkering to get working with your AMD graphics card. Refer to the table below and use the project links to check further details on the official websites and repositories for the listed software. All of the listed programs support both ROCm and Vulkan, unless stated otherwise.

If you’re on mobile, scroll right on the table to see all the columns.

| Software | GUI/WebUI Available | OS Compatibility | AMD Support | Official Docker Container | Project Links |
|---|---|---|---|---|---|
| LM Studio | Yes | Windows, Linux, macOS | Yes | Not available | Project Site, GitHub Page |
| Gpt4All | Yes | Windows, Linux, macOS | Yes | Not available | Project Site, GitHub Page |
| KoboldCPP | Yes | Windows, Linux, macOS | Yes | Not available | GitHub Page (Main), GitHub Page (ROCm) |
| MLC LLM | Yes | Windows, Linux, macOS, iOS, Android* | Yes | Not available | Project Site, GitHub Page |
| Llama.cpp | No | Windows, Linux, macOS | Yes | Available | GitHub Page |
| Ollama | No | Windows, Linux, macOS | Yes | Available | Project Site, GitHub Page |
| AnythingLLM | Yes | Windows, Linux, macOS | Yes | Available | Project Site, GitHub Page |
| Oobabooga | Yes | Windows, Linux, macOS | Yes | Available | GitHub Page |
| Jan | Yes | Windows, Linux, macOS | Yes, only via Vulkan | Available | Project Site, GitHub Page |
| Llamafile(s) | No | Windows, Linux, macOS | Yes | Available** | Project Documentation, GitHub Page |

*Android and iOS are listed only if the software has an official Android/iOS application. Some of the programs here (for instance Llama.cpp and Ollama) can, with some additional effort, be run on Android devices, for example by using Termux.

**You can easily containerize Llamafiles using the following method.

If you want to know which of these you should choose if you have no prior experience with this kind of software, or are simply curious how simple the installation process is for each of them and want some additional details, the next section of this article should take care of that.

So, Which One of These Is The Easiest to Set Up & Install?

To answer this question, below is a quick summary of all the software from the table above with additional comments, mainly on how simple the installation process is. This should give you a better idea of which solutions to try first if you don’t want to spend much time on the setup phase. This information will be updated alongside the table above.

  1. LM Studio – Extremely easy to set up on AMD systems as well as for NVIDIA users; one of the best options for beginners. Features one-click installers for Windows, macOS and Linux. Unlike the other software on this list, though, it is closed source.
  2. Gpt4All – Just as with LM Studio, simple installers are available for Windows, macOS and Linux.
  3. KoboldCPP – Alongside its ROCm-compatible fork, it has a one-click installer available for Windows and a simple installation script for Linux.
  4. MLC LLM – You can easily install it using Python pip with a single console command on Windows, macOS and Linux (see the install sketch after this list). Installing it in an isolated conda virtual environment is highly recommended. The native Android and iOS mobile apps available for download are one of the strongest points of this software.
  5. Llama.cpp – On Windows, it has pre-compiled binaries available to unpack and run; on Linux, you can install llama.cpp using brew, flox or nix.
  6. Ollama – Simple installers are available for Windows and macOS, and an installation script for Linux (see the quick-start sketch after this list). It has an official list of supported AMD cards.
  7. AnythingLLM – For Windows and macOS, AnythingLLM features a one-click installer that automatically downloads all the dependencies the program needs to use the GPU acceleration methods appropriate for your system. The same goes for Linux with its simple installation script.
  8. Oobabooga WebUI – On Windows, it requires a few additional installation steps to be compatible with AMD graphics cards. On Linux, ensuring AMD compatibility requires manually installing a few additional software packages.
  9. Jan – Installers are available for Windows, macOS and Linux; Vulkan acceleration support for AMD GPUs needs to be enabled in the software settings.
  10. Llamafile – Llamafiles can be used right away after downloading them and typically don’t require any additional setup beforehand.
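As referenced in points 4 and 6 above, here is a minimal sketch of what those two installs look like in practice. The package names and the model tag are taken from the projects’ documentation as of late 2024 and may change, so verify them against the official repositories before running anything:

```bash
# MLC LLM (point 4): install the prebuilt nightly wheels via pip.
# Package names follow the project's wheel index at the time of writing --
# consult the official MLC LLM docs for the current install command.
python -m pip install --pre -U -f https://mlc.ai/wheels mlc-llm-nightly mlc-ai-nightly

# Ollama (point 6): official Linux install script, then a first run.
# "llama3.2" is just an example model tag from the Ollama library.
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.2
# On an officially unsupported AMD card you may also need the
# HSA_OVERRIDE_GFX_VERSION workaround described earlier in this article.
```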

Are NVIDIA Cards Any Better When It Comes To Local LLMs?

NVIDIA graphics cards still have the upper hand over AMD hardware when it comes to local LLM inference.

The truth is that while AMD is still a little behind when it comes to AI software compatibility, mainly because of the slower development of ROCm in comparison to NVIDIA’s CUDA technology, many of their newer GPU models are more than enough for most tasks related to hosting local large language models.

In practice, when using AMD graphics cards with AMD-compatible LLM software, you might sometimes come across a somewhat more complex setup (though not always, as you’ve already seen with some of the examples above) and slightly slower inference (which will likely become less of an issue as the ROCm framework evolves).

AMD graphics cards are currently far less expensive than those manufactured by NVIDIA, especially when you consider the price per 1 GB of VRAM, which, as you might already know, is the one thing you can never have enough of when locally hosting larger, higher-quality language models.

Taking all this into account, I can safely say this: AMD cards are, in general, becoming more and more reliable for handling all sorts of AI-related tasks as time goes by, and their prices remain much lower than NVIDIA’s, even for the latest models on the market.

If you can live with the fact that in certain contexts you might face a somewhat more complex software configuration, older cards not being compatible with some of the programs you use, or other compatibility issues along the way, then in my honest opinion you can safely stick with AMD or choose Radeon cards for your new LLM hosting setup. For now, however, the price difference may at times come at the cost of considerable time spent troubleshooting, so keep that in mind. Well, let’s see what the future holds!

Tom Smigla – https://techtactician.com/
Tom is the founder of TechTactician.com with years of experience as a professional hardware and software reviewer. Armed with a master’s degree in Cultural Studies / Cyberculture & Media, he created the "Consumer Usability Benchmark Methodology" to ensure all the content he produces is practical and real-world focused.
