Azure AI Foundry and Azure OpenAI Service (AOAI) keep getting better all the time! The latest addition in Azure AI Foundry (as of April 14, 2025, yesterday) is the GPT-4.1 model, which has a generous 1M-token context window and a knowledge cutoff of June 2024. One million tokens of context "memory" translates to roughly 1,500 pages of a document. I wonder whether the recently published summarization feature in Word uses this model with some trick, or just has a really clever approach to summarizing, as it should support summarizing documents up to 3,000 pages.
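The 1,500-page figure is easy to sanity-check with back-of-the-envelope arithmetic. The ratios below (about 1.33 tokens per English word, about 500 words per page) are common rules of thumb, not official numbers:

```python
def tokens_to_pages(tokens: int,
                    tokens_per_word: float = 1.33,
                    words_per_page: int = 500) -> float:
    """Rough page estimate from a token count, using rule-of-thumb ratios."""
    return tokens / tokens_per_word / words_per_page

# A 1M-token context works out to roughly 1,500 pages with these assumptions.
print(round(tokens_to_pages(1_000_000)))
```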

What’s New: GPT-4.1 Overview

GPT-4.1 is the latest iteration of the GPT-4o model, trained to excel at coding and instruction-following tasks. This model will improve the quality of agentic workflows and accelerate the productivity of developers across all scenarios.

Announcing the GPT-4.1 model series for Azure AI Foundry and GitHub developers

The GPT-4.1 context window of 1 million tokens is very generous. As the model supports a wide range of features, it is a very usable model for agent, coding, and analysis scenarios. No, it is not a reasoning model, if that is the kind of analysis focus you are looking for, but the very large context window does help to take a large amount of data into account, and that helps especially with coding. If you want to try it out, GPT-4.1 is already available in public preview for Copilot in GitHub.

OpenAI GPT-4.1 is rolling out for all Copilot Plans, including Copilot Free. You can access it through the model picker in Visual Studio Code and on github.com chat. To accelerate your workflow, whether you’re debugging, refactoring, modernizing, testing, or just getting started, select “GPT-4.1 (Preview)” to begin using it.

OpenAI GPT-4.1 now available in public preview for GitHub Copilot and GitHub Models

What is a 1M-token context good for? Agents. As we move more and more towards a world where AI has a memory, and the Responses/Assistants APIs have been implementing that already, the larger context counts: we can continue the conversation for longer with more information included.
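A minimal sketch of why context size matters for agents: a chat history simply grows turn by turn, and the whole list is re-sent on every request, so everything has to fit in the context window (the turn contents below are illustrative):

```python
history = [{"role": "system", "content": "You are a project assistant."}]

def add_turn(history: list, user_text: str, assistant_text: str) -> list:
    """Append one exchange; the full history is re-sent with every request."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

add_turn(history, "Summarize the risk register.", "Here is a summary ...")
add_turn(history, "Now draft mitigations.", "Draft mitigations ...")

# Every message so far counts against the context window on the next turn.
print(len(history))
```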

Key Features of GPT-4.1

What else is on the list? Along with 16K-token output support there are:

Text and image processing

JSON mode

Parallel function calling

Enhanced accuracy and responsiveness

Parity with English text and coding tasks compared to GPT-4 Turbo with Vision

Superior performance in non-English languages and in vision tasks

Support for enhancements

Support for complex structured outputs

I am very pleased to see superior performance listed for non-English languages, and of course complex structured outputs with JSON mode will help big time with agents.

From the model description page, the details are:

Text & image input

Text output

Chat completions API

Responses API

Streaming

Function calling

Structured outputs (chat completions)

What is odd is that in Learn the Max Output Tokens value is 32K, while in the model description (when deploying) it is 16K.

Now that I have the model deployed, it is time to start testing it.
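As a minimal sketch of calling a deployment: the Azure OpenAI Chat Completions REST endpoint is addressed by deployment name and an api-version query parameter. The endpoint, deployment name, and API version below are placeholders, not values from this post:

```python
import json

def build_chat_request(endpoint: str, deployment: str,
                       api_version: str, prompt: str):
    """Compose the URL and JSON body for an Azure OpenAI chat completions call."""
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }
    return url, json.dumps(body)

url, body = build_chat_request(
    "https://my-resource.openai.azure.com",  # placeholder endpoint
    "gpt-4.1",                               # placeholder deployment name
    "2024-10-21",                            # placeholder api-version
    "Create a detailed testing plan that answers all identified risks.",
)
print(url)
```

The actual call would POST this body with an `api-key` header; the official SDKs wrap the same endpoint.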

Pricing and testing

Just keep in mind that the billing model is different for contexts up to 128K tokens and for larger (up to 1M tokens) context inputs. At the time of writing this post, there wasn't any information about the pricing on the Azure OpenAI Service pricing page. Also, I can't see the GPT-4.1-mini or GPT-4.1-nano models in the catalogue yet. In addition to the standard model, you will get the 1M context length with mini and nano as well, at a lower cost (at the expense of capability).
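The tiered billing can be sketched like this. Since no official Azure prices were published at the time of writing, the rates are plain parameters and every number in the example call is made up; the sketch also assumes the whole request is billed at one rate, which is an assumption about how the tiers work:

```python
def input_cost(tokens: int, short_rate: float, long_rate: float,
               threshold: int = 128_000) -> float:
    """Input cost for one request, with a different per-1K-token rate above
    the threshold. Rates are hypothetical placeholders, per 1,000 tokens."""
    rate = short_rate if tokens <= threshold else long_rate
    return tokens / 1000 * rate

# Hypothetical rates: a short-context request vs. a long-context one.
print(input_cost(50_000, short_rate=1.0, long_rate=2.0))
print(input_cost(500_000, short_rate=1.0, long_rate=2.0))
```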

For more information, check out Microsoft’s announcement blog article about GPT-4.1. From the article some additional highlights:

Supervised fine-tuning for GPT-4.1 and 4.1-mini is coming soon (this week)

Improved instruction following: The model excels at following detailed instructions, especially in agent scenarios containing multiple requests. It is more intuitive and collaborative, making it easier to work with for various applications.

Enhanced coding and instruction following: The model is optimized for better handling of complex technical and coding problems. It generates cleaner, simpler front-end code, accurately identifies necessary changes in existing code, and consistently produces outputs that compile and run successfully.

Just for a quick test, I attached GPT-4.1 to a data source and asked it to "create a detailed testing plan that answers all identified risks in the project". There are just two documents: one about risk management and one with the identified risks.

Another test I did was to ask for improvements to this blog draft, based on the plain-text version and a few attached pictures.

And as this blog post was written in one flow, I didn't use AI to generate the first draft. Asking for better grammar gave me plenty of advice.

I fixed some of these, which hopefully improved the readability.

These are just simple tests, but in time I will use this model for more advanced scenarios.

Published by Vesa Nopanen

Vesa “Vesku” Nopanen, Principal Consultant and Microsoft MVP (M365 and AI Platform) working on Future Work at Sulava.

I work, blog and speak about Future Work : AI, Microsoft 365, Copilot, Loop, Azure, and other services & platforms in the cloud connecting digital and physical and people together.

I have 30 years of experience in IT business on multiple industries, domains, and roles.


