How to install GPT-OSS on Windows and get the most out of it

Last update: 07/01/2026
Author: Isaac
  • GPT-OSS lets you run OpenAI's models locally on Windows, without depending on the cloud and with full privacy.
  • The gpt-oss-20b model is best suited for home PCs, requiring at least 16 GB of memory and a modern GPU for good performance.
  • Ollama and LM Studio simplify installing and using GPT-OSS, offering guided downloads and a local, ChatGPT-style chat interface.
  • Once installed, GPT-OSS is useful for writing, studying, programming and document analysis directly from your computer.

Install GPT-OSS on Windows

If you've been using ChatGPT or other cloud-based AI for a while, you've probably wondered what it would be like to have something similar installed directly on your Windows PC: no monthly fees, no dependence on external servers, and no handing over a single conversation. That's precisely what OpenAI's new open models allow: gpt-oss-20b and gpt-oss-120b.

In the following lines we'll look in detail at how to install GPT-OSS on Windows: what you need in terms of hardware, the differences between the two variants of the model, and how to use it with both Ollama and LM Studio. I'll also cover what it's useful for in everyday work, what to expect in terms of performance on a standard PC, and what trade-offs you'll face if your computer is running on fumes.

What is GPT-OSS and what can it do for you?


GPT-OSS is OpenAI's family of open-source models, designed so that anyone can download and run them on their own devices: computers, workstations, and even some powerful phones. The name comes from Generative Pretrained Transformer – Open Source Series; that is, a series of generative text models released under the Apache 2.0 license.

With GPT-OSS you can perform virtually the same tasks as with a cloud-based assistant: generating texts of all kinds (emails, social media posts, scripts, stories, poems…), summarizing long documents, rewriting paragraphs, improving writing, or adapting the tone to a more formal or more approachable style.

It is also capable of handling more technical tasks: explaining code, detecting programming errors, suggesting better approaches in different languages, helping you learn to program, or giving you hints for solving logic and math problems. It doesn't have direct internet access, but it handles step-by-step reasoning very well and can guide you through many complex processes.

In the area of personal productivity, GPT-OSS works very well for organizing projects and brainstorming: it can help you create work plans, generate to-do lists, answer questions on a wide range of topics, or draft reports and academic papers. If you're a student, freelancer, content creator, or developer, you can get a lot out of running it locally.

Advantages and disadvantages of using GPT-OSS locally on Windows

The big difference between GPT-OSS and services like ChatGPT, Gemini, or Claude is that you don't depend on a remote server: the model runs on your own Windows PC, which brings some very clear advantages, but also some drawbacks you should be aware of before installing it.

The first big advantage is the privacy of your data: every conversation, file, or question you feed the model stays on your computer, inaccessible to any company that could use it to train more models, profile you, or show you ads. If you need to handle sensitive documents, contracts, internal company data, or personal information, this is invaluable.

Closely related to this is security: since nothing is sent to the cloud, there's no data traffic to third parties, which drastically reduces the attack surface tied to external services. You obviously still have to keep your own PC secure, but at least you eliminate the cloud provider as a risk factor.

Another plus is cost: commercial services typically charge a monthly subscription (ChatGPT Plus, Gemini Advanced, etc.) or bill per use via an API. With GPT-OSS, the model is completely free: you download it once, install it, and that's it, with no recurring fees or artificial message limits.

Furthermore, being an open model, you have a degree of control and customization that closed services don't offer. You can adjust parameters, change default behavior, integrate it into your own applications, automate tasks with scripts or connect it to local tools using APIs that run on your machine.

The less appealing side comes from hardware and complexity. Running a model like this is demanding for your machine, and response speed depends heavily on your CPU, GPU, and memory. On a powerful computer, responses flow quite quickly; on a basic laptop, you'll notice it takes its time.

You also take on a certain technical burden: you're the one who installs, configures, updates, and secures the whole setup. It's not overly complicated thanks to tools like Ollama or LM Studio, but it's still more involved than simply opening a website and starting to type.


Differences between gpt-oss-20b and gpt-oss-120b

Within the GPT-OSS family you will mainly find two model sizes: gpt-oss-20b and gpt-oss-120b. Although the names are similar, they are not in the same league in terms of capabilities or, above all, hardware requirements.

gpt-oss-120b is the large model, designed for data centers, multi-GPU workstations, or very high-end machines. Its performance is close to commercial models like OpenAI's o4-mini, but in return it requires at least 60 GB of VRAM or unified memory, which rules out virtually all home computers.

At the more accessible end we have gpt-oss-20b, the mid-sized model, close in capability to models like o3-mini. It's designed for consumer devices: it needs about 16 GB of VRAM or unified memory to run reasonably, and it can work on gaming desktops, many laptops with a dedicated GPU, and some Macs with Apple Silicon.

In practice, if your intention is to install GPT-OSS on a Windows PC at home or in the office, the realistic candidate is gpt-oss-20b; the 120b is reserved for very specific setups with multiple graphics cards or professional workstations.

Recommended minimum requirements for installing GPT-OSS on Windows

Before starting with downloads and installations, it's worth reviewing what GPT-OSS needs to run decently on Windows. The good news is that, for the 20b model, the requirements aren't unreasonable for a modern PC, although they are demanding for older laptops.

As for the operating system, you need 64-bit Windows 10 or Windows 11; 32-bit versions are completely ruled out, both for memory limitations and for compatibility with the tools we'll be using.

Memory is one of the key points. For gpt-oss-20b, it's recommended to have at least 16 GB of RAM, which leaves some headroom and keeps the system stable while the model is running. Technically, it can start on systems with 8 GB of RAM, but the experience becomes very limited, and you'll need to close almost everything else to avoid bottlenecks.

If we're talking about gpt-oss-120b, the picture changes quite a bit: a minimum of 32 GB of RAM, ideally more, plus a huge amount of VRAM, which makes installing it on a normal Windows PC practically unfeasible.

As for the processor, you don't need the latest and greatest, but nothing prehistoric either: at minimum, a 4th-generation Intel Core i5 or later, or an AMD Ryzen 3 or higher. The CPU alone can run the model if you don't have a GPU, but text generation will be considerably slower.

For storage, keep in mind that these models take up a fair amount of space, and it's always a good idea to leave some free space so Windows doesn't get bogged down. An SSD with at least 500 GB free will give you room for GPT-OSS and other models. As a guide, gpt-oss-20b is around 12-13 GB, while gpt-oss-120b can go up to 70 GB depending on the version.
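The disk figures above follow from simple arithmetic on parameter count and quantization width. As a rough sketch (the 4.25 bits-per-weight figure is an assumption approximating a 4-bit format with per-block scale metadata, and the 21B/117B parameter counts are the commonly cited sizes of the two models; real files add tokenizer and metadata overhead):

```python
def approx_model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough on-disk size in GB (decimal) for a quantized model:
    parameters x bits per weight, converted from bits to gigabytes."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

print(round(approx_model_size_gb(21, 4.25), 1))   # ~11.2 GB of weights for gpt-oss-20b
print(round(approx_model_size_gb(117, 4.25), 1))  # ~62.2 GB of weights for gpt-oss-120b
```

Both estimates land close to the 12-13 GB and ~70 GB download sizes mentioned above once you allow for the extra files that ship alongside the weights.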

The key component for accelerating generation is the graphics card (GPU). For comfortable performance, ideally you'd have an NVIDIA GeForce RTX 3060 or better, or an AMD Radeon RX 6700 or better. Older cards will still work, but tokens per second will drop and responses will feel noticeably slower.

If you don't have a dedicated GPU, GPT-OSS can run using only the CPU and RAM, although at a significantly slower speed. Integrated GPUs offer some help, but they're nowhere near as good as a modern dedicated graphics card.

Finally, you'll only need an internet connection to download the model. Once it's downloaded and installed, everything runs completely offline, so you can unplug the cable or turn off Wi-Fi and keep using the AI as normal.

How to install GPT-OSS on Windows using Ollama (graphical interface and commands)

To avoid grappling directly with raw commands and configuration files, the easiest approach is to use tools designed to manage language models. Here, Ollama is one of the simplest and most polished options for Windows users, whether you want a graphical interface or prefer the terminal.

Ollama works as a local LLM "launcher": it takes care of downloading, storing, and running GPT-OSS (and other models like LLaMA, Gemma, or Qwen) through a fairly guided installation. It's free, open source, and available for Windows, macOS, and Linux.

The process begins by going to the official Ollama website and downloading the Windows installer, usually a file called something like OllamaSetup.exe. You save the file, run it, and follow the typical steps of any desktop program: accept the terms, choose a different folder if you want, and wait for it to finish.


Before installing, check that you meet the following minimum requirements: Windows 10/11 64-bit, at least 8 GB of RAM (although 16 GB is ideal for GPT-OSS), and at least a quad-core x86 CPU (for example, a 4th generation Intel Core i5/i7 or an AMD Ryzen 3/5/7). A dedicated GPU is optional, but highly recommended to accelerate the experience.

When you open Ollama for the first time, you'll see a chat-like interface. In the center, below the program's logo, there's a box where you can select the model you want to use; when you expand it, you'll find a list of models available both in the cloud and locally.

Among them you'll see the entries gpt-oss:20b and gpt-oss:120b. If you're on a home PC, choose gpt-oss:20b, the mid-sized model. Once selected, simply type any message in the text box (a simple "hello" will do) and send it; Ollama will automatically start downloading the model.

The download may take from a few seconds to several minutes, depending on your connection, since the file weighs around 12.8-13 GB. Once completed, the model loads and you can start chatting with GPT-OSS as if you were in front of ChatGPT, but without leaving your computer.

If you prefer the command line to the graphical interface, Ollama supports that workflow too. From PowerShell or Windows Terminal you can run "ollama pull gpt-oss:20b" to download the model and "ollama run gpt-oss:20b" to start it and begin chatting. For the larger model, simply change the name to gpt-oss:120b.

Installing and using GPT-OSS on Windows with LM Studio

If you'd like a more complete environment with more adjustable parameters, you can try LM Studio. It's another tool for downloading, managing, and running AI models locally, also available for Windows, macOS, and Linux. You could say Ollama is more minimalist and straightforward, while LM Studio offers a more visual interface and many extra options.

As for requirements, LM Studio on Windows needs a 64-bit CPU with AVX2 support; 16 GB of RAM is recommended to work comfortably with 7-8B models, and a GPU is needed if you want to speed the whole process up. With 8 GB of RAM you can still run small 3-4B models with short contexts, but for something the size of gpt-oss-20b, it's better to have headroom.

In terms of storage, each model can take up anywhere from 2 GB to over 20 GB, and there are gpt-oss-20b variants that go considerably higher depending on how they are quantized. It's best to reserve at least 100 GB of free space if you plan to download several models and experiment with different versions.

To install LM Studio, go to its official website, choose the Windows version, and download the executable (usually around 500-600 MB). Double-click it, select the destination folder (the application itself needs about 1.7 GB of space), and click install. When it finishes, you'll see its main interface ready to use.

The next step is to click the magnifying glass icon in the left sidebar, which opens the model search engine. From there you can explore all the models compatible with local execution, including GPT-OSS in its 20b variant.

In the list, locate gpt-oss-20b (it may appear as openai/gpt-oss-20b or a similar path) and click Download. LM Studio will begin downloading the model; again, the time depends on your connection and the size of the version you've chosen.

When the download completes, go to the "Chats" section in the left column. There you'll see a dropdown called something like "Select a model to load"; choose gpt-oss-20b and an initial setup screen will open with several sliders and options.

The two most important parameters here are the context length and the number of layers offloaded to the GPU. The context determines how many tokens (words and word fragments, to put it simply) the model can remember within a conversation. The higher you set it, the more memory it consumes and the more computation each token requires, which means higher RAM/VRAM usage and potential errors if your hardware is struggling.
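To get a feel for why longer contexts eat memory, here's a tiny illustrative calculation. The megabytes-per-1k-tokens figure is a made-up placeholder, not a measured value for gpt-oss-20b; the point is only that the cache the model keeps per token grows linearly with context length:

```python
# Hypothetical cache cost per 1,000 tokens of context (illustrative only).
KV_MB_PER_1K_TOKENS = 64

def kv_cache_gb(context_tokens: int) -> float:
    """Estimated extra memory (GB) consumed by the conversation cache."""
    return context_tokens / 1000 * KV_MB_PER_1K_TOKENS / 1024  # MB -> GB

print(round(kv_cache_gb(8192), 2))   # ~0.51 GB at an 8k context
print(round(kv_cache_gb(32768), 2))  # ~2.05 GB at a 32k context
```

Quadrupling the context quadruples this cost, which is why cranking the slider to the maximum on modest hardware often ends in out-of-memory errors.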

The "offload to GPU" option defines how many layers of the model run directly on the graphics card. The more layers you load onto the GPU, the faster it will generate text, but the more VRAM it will consume. If you go too far and exhaust the VRAM, performance will plummet or the model won't even load, so the sensible approach is to increase it gradually until you find the sweet spot.
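That "increase gradually" advice can also be framed as a small budgeting calculation. Everything here (per-layer weight size, reserved headroom, layer count) is a hypothetical example rather than measured data for gpt-oss-20b, but the structure of the estimate carries over:

```python
def max_offload_layers(vram_gb: float, reserved_gb: float,
                       per_layer_gb: float, total_layers: int) -> int:
    """How many model layers fit on the GPU after reserving headroom
    for the context cache, activations, and the OS/driver."""
    usable = vram_gb - reserved_gb
    if usable <= 0:
        return 0  # no room: run everything on CPU/RAM
    return min(total_layers, int(usable / per_layer_gb))

# Example: 12 GB card, keep ~3 GB free, ~0.45 GB of weights per layer, 24 layers.
print(max_offload_layers(12, 3, 0.45, 24))  # 20 layers on the GPU
```

In practice you'd still nudge the slider a few layers up or down from such an estimate, since actual per-layer sizes depend on the quantization you downloaded.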

Once these details are set, click "Load model". LM Studio will open a chat very similar to ChatGPT, where you can type questions, paste texts to have them summarized, or ask for help with code.


Download GPT-OSS from Hugging Face or GitHub and other ways to use it

Although the most convenient option for most Windows users is Ollama or LM Studio, OpenAI also offers direct downloads of GPT-OSS from repositories like Hugging Face and GitHub. This route is primarily intended for developers and advanced users who want complete control over the integration.

On Hugging Face you'll find the different variants of gpt-oss-20b and gpt-oss-120b, with versions adapted and optimized by the community for different kinds of hardware and libraries. Each can vary in size (there are 20b builds ranging from about 11 GB to more than 40 GB) and in performance, depending on the type of quantization they use.

The other official download point is GitHub, where OpenAI publishes the necessary resources for working with GPT-OSS, including usage examples, scripts, and documentation for integrating it into projects. From there, you can prepare specific environments, containers, or custom pipelines if you want to set up something more substantial than a simple local chat.

Besides running on a PC, there are also options for testing GPT-OSS on Android and iOS devices using third-party apps. Although OpenAI doesn't endorse any specific one, a popular option is PocketPal AI, which lets you add models from Hugging Face and run them locally on some mid-to-high-end phones.

The procedure usually involves installing the app, going to the models section, choosing "Add from Hugging Face", searching for gpt-oss or gpt-oss-20b, and downloading the version that best suits your device's storage and memory. On mobile, however, the balance between model size and performance is quite delicate, and it's not uncommon to have to pick smaller variants to keep everything running smoothly.

What can you do with GPT-OSS on your Windows PC?

Once you have GPT-OSS installed and running with Ollama or LM Studio, a huge range of practical uses opens up for your day-to-day work, with the peace of mind that everything stays on your computer.

On the text side, it's perfect for writing articles, summaries, emails, scripts, and social media posts. You can give it a long document in PDF or plain text and ask it to extract the key ideas, adapt it to a different audience, improve the tone, or summarize the conclusions in a few lines.

It's also very useful as a study and work assistant: it can explain concepts, generate topic outlines, create flashcards, correct essays, or provide practice exercises. Combined with its ability to analyze files you drag into the program window, it becomes a powerful tool for managing reports, academic papers, or technical documentation.

In development work, GPT-OSS serves as an offline programming partner: it can review code snippets, flag errors, suggest refactors, generate helper functions, or explain what a script does line by line. It doesn't replace an IDE or a debugger, but it saves you a lot of search time and gives you ideas when you're stuck.

Furthermore, thanks to the local APIs exposed by tools like Ollama, you can integrate GPT-OSS into your own applications, automating tasks or building small personalized assistants that work with your own data without relying on external services.
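As a minimal sketch of that kind of integration, this Python snippet posts a prompt to Ollama's local REST endpoint (it listens on http://localhost:11434 by default). It assumes Ollama is running and gpt-oss:20b has already been pulled; adjust the model name to whatever you have installed:

```python
import json
import urllib.request

def build_request(model: str, prompt: str) -> dict:
    # "stream": False asks for one complete JSON reply instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Example (requires a running Ollama server):
#   print(ask("gpt-oss:20b", "Summarize why local LLMs help privacy."))
```

From here it's straightforward to wrap the call in a script that summarizes files in a folder, drafts replies, or feeds your own notes into the prompt, all without touching an external service.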

The main limitation is that the model has no access to real-time information. Everything it knows comes from its training data, so it's not the best option for checking breaking news, very recent legal changes, or constantly changing figures. For that, you'll still need a network-connected model or a traditional search.

In terms of performance, it's normal for GPT-OSS to feel slower than a ChatGPT hosted in a data center full of GPUs. The longer the context and the more complex the task, the longer the response will take, especially if your GPU or RAM is underpowered. Closing browsers with many tabs or other resource-hungry programs while using the model helps everything run more smoothly.

With all this, GPT-OSS becomes a kind of "copilot" that lives on your own Windows PC: free, private, highly customizable, and available even offline. With a little patience for the initial setup and some hardware tuning, you have a very capable assistant for writing, programming, studying, and experimenting with generative AI without leaving your desk.

Related article:
How to install ChatGPT on Windows 11 step by step and safely