2025-04-09
JSONL Datasets, Evals/Scoring, and more

JSONL Datasets

Datasets are now supported as JSONL files. This allows you to test your prompts in bulk against large datasets, and supports streaming.

Read Docs
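A JSONL dataset is simply one JSON record per line, which is what makes streaming it row-by-row straightforward. The sketch below is a minimal, generic parser; the field names (`input`, `expected`) are illustrative assumptions, not Puzzlet's actual dataset schema.

```typescript
// Hypothetical JSONL dataset row — the shape is an assumption for illustration.
interface DatasetRow {
  input: Record<string, unknown>;
  expected?: string;
}

// Parse JSONL text: one JSON object per line, blank lines skipped.
// A streaming implementation would apply the same per-line logic to a
// readline interface over a file stream instead of a full string.
function parseJsonl(text: string): DatasetRow[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as DatasetRow);
}

const jsonl =
  '{"input":{"question":"2+2?"},"expected":"4"}\n' +
  '{"input":{"question":"Capital of France?"},"expected":"Paris"}\n';

const rows = parseJsonl(jsonl);
console.log(rows.length); // 2
```

Because each line is independent, a large dataset never needs to be held in memory all at once.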

Evals & Scoring

We rolled out our initial evals support. Evals allow you to evaluate your prompts against a set of data, and get a score. More to come here soon.

Read Docs
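Conceptually, an eval runs a prompt over each row of a dataset and scores the outputs. The sketch below illustrates that flow with a pluggable scorer; `runPrompt`, the row shape, and the exact-match scorer are stand-ins for illustration, not Puzzlet's actual API.

```typescript
// Hypothetical eval row and scorer shapes — assumptions for illustration.
interface EvalRow {
  input: string;
  expected: string;
}

type Scorer = (output: string, expected: string) => number;

// Simplest possible scorer: 1 if the output matches exactly, else 0.
const exactMatch: Scorer = (output, expected) => (output === expected ? 1 : 0);

// Run every row through the prompt and return the mean score in [0, 1].
function runEval(
  rows: EvalRow[],
  runPrompt: (input: string) => string,
  score: Scorer
): number {
  const total = rows.reduce(
    (sum, row) => sum + score(runPrompt(row.input), row.expected),
    0
  );
  return rows.length === 0 ? 0 : total / rows.length;
}

// Toy "model" that uppercases its input, to show the flow end to end.
const evalRows: EvalRow[] = [
  { input: "hi", expected: "HI" },
  { input: "ok", expected: "no" },
];
console.log(runEval(evalRows, (s) => s.toUpperCase(), exactMatch)); // 0.5
```

Swapping in a different `Scorer` (semantic similarity, regex match, LLM-as-judge) changes the metric without changing the run loop.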

Other

  • Consolidating prompts, evals, and datasets into single “files”
  • Officially rolled out alerts
  • Some CLI improvements
  • Minor bug fixes
2025-03-12
Sessions, Alerts, Trace UI Improvements, Onboarding Improvements

Sessions

Sessions provide a way to group related traces together, making it easier to monitor and debug complex workflows in your LLM applications. By organizing traces into sessions, you can track the entire lifecycle of a user interaction or a multi-step process.

Read Docs
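The core idea is that every trace carries a session identifier, so related traces can be pulled together into one unit. A minimal sketch of that grouping, with illustrative field names that are assumptions rather than Puzzlet's trace schema:

```typescript
// Hypothetical trace record — sessionId ties related traces together.
interface Trace {
  sessionId: string;
  name: string;
  durationMs: number;
}

// Group a flat list of traces into per-session buckets.
function groupBySession(traces: Trace[]): Map<string, Trace[]> {
  const sessions = new Map<string, Trace[]>();
  for (const t of traces) {
    const group = sessions.get(t.sessionId) ?? [];
    group.push(t);
    sessions.set(t.sessionId, group);
  }
  return sessions;
}

const traces: Trace[] = [
  { sessionId: "s1", name: "plan", durationMs: 120 },
  { sessionId: "s1", name: "answer", durationMs: 340 },
  { sessionId: "s2", name: "plan", durationMs: 90 },
];
console.log(groupBySession(traces).get("s1")?.length); // 2
```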

Alerts

Now, you can get notified when your application is experiencing increased errors, latency, or costs. Configure alerts to notify you via Slack or a webhook.

Read Docs

Traces UI Improvements

Traces now have a more user-friendly UI, with a focus on providing important information at a glance.

Onboarding Improvements

We’ve improved our onboarding. Now, you can see your dashboard without having to sync your repo first. We also support modular onboarding, so you can skip steps you don’t need.

2025-02-18
Add Trace Examples to Datasets, Load Trace in Prompt, Re-indexing, App UI Improvements, Bug Fixes

Adding Examples to Datasets

You can now add production trace data to your datasets with a single click.

Read Docs

Adding Examples to Prompts

You can now add production trace examples to your prompts. This allows you to iterate on and test your prompts with real data.

Read Docs

Re-indexing

You can now re-index your prompts and datasets. This allows you to perform a fresh pull of the content from your synced repository.

App UI Improvements

You can now easily view your app’s repo configuration, including repo name, branch, and more.

2025-01-27
Type Safety, Datasets, and more

Type Safety

Puzzlet aims to provide developers with the best developer experience possible. As part of this, we’ve just added type safety to our platform.

  • Types can now be generated via our CLI
  • Fetching prompts from our CDN or AgentMark is now type-safe
  • Prompts now support run/compile/deserialize functions

Read more about Type Safety
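The payoff of generated types is that the compiler checks both the prompt name and its input variables. The sketch below shows the general pattern; the `Prompts` interface stands in for CLI-generated types, and `compilePrompt` is an illustrative helper, not Puzzlet's real API.

```typescript
// Hypothetical generated types: each prompt name maps to its input props.
// In practice this interface would be emitted by the CLI, not hand-written.
interface Prompts {
  "summarize.md": { props: { text: string } };
  "translate.md": { props: { text: string; language: string } };
}

// The prompt name constrains the props type, so mismatches fail to compile.
function compilePrompt<K extends keyof Prompts>(
  name: K,
  props: Prompts[K]["props"]
): string {
  // A real implementation would render the template; we interpolate
  // for illustration only.
  return `${name}: ${JSON.stringify(props)}`;
}

// compilePrompt("summarize.md", { language: "fr" }); // rejected at compile time
console.log(compilePrompt("summarize.md", { text: "hello" }));
```

The same `keyof`-indexed pattern applies to run, compile, and deserialize calls: one generated interface constrains them all.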

Datasets

Datasets now allow you to test your prompts in bulk against a large set of data.

  • Run your datasets in bulk against your prompts
  • View previous runs, with inputs/outputs
  • View traces associated with each run
  • View high-level metrics for each run

Read more about Datasets

Trace Grouping

Traces can now be grouped based on the trace function and the component function. trace groups at the root level, while component allows for sub-groups.

  • New function added: trace
  • New function added: component
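The relationship between the two can be pictured as a small tree: one root group per trace call, with component calls nesting beneath it. The signatures below are illustrative assumptions to show the shape of the idea, not Puzzlet's exact API.

```typescript
// Hypothetical group node: a named span with nested children.
interface Group {
  name: string;
  children: Group[];
}

// trace() opens a root-level group and runs the workload inside it.
function trace<T>(name: string, fn: (root: Group) => T): { root: Group; result: T } {
  const root: Group = { name, children: [] };
  return { root, result: fn(root) };
}

// component() opens a sub-group under an existing group.
function component<T>(parent: Group, name: string, fn: (g: Group) => T): T {
  const child: Group = { name, children: [] };
  parent.children.push(child);
  return fn(child);
}

const { root } = trace("handle-request", (g) => {
  component(g, "retrieve", () => "docs");
  component(g, "generate", () => "answer");
});
console.log(root.children.map((c) => c.name)); // the two sub-groups under the root
```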

CLI Improvements

Our CLI has been improved to provide a better developer experience.

  • Puzzlet init can optionally create an example app
  • Added pull-models to walk through adding new models to your platform

Read More about our CLI

Bug Fixes

  • Fixed a bug which could cause an app’s templates to be deleted when a new app was created
  • Fixed a bug which could cause some branches not to show up in the UI
  • Fixed a bug which could prevent newly created local prompts from being synced to the platform

Other

  • Improved UI for prompts input/output
  • Paginate traces
  • Improved UI theme for prompts
2025-01-16
Initial Puzzlet Release

Overview

Puzzlet is the git-based Prompt Engineering Platform that empowers both application developers and prompt engineers to collaborate seamlessly on GenAI products. Puzzlet enables application developers to manage their configuration, prompts, datasets, and evals in a git-based workflow, while also providing a hosted platform for collaboration with non-technical team members.

Features

  • Prompt Management
  • Observability
  • Datasets
  • CLI
  • Platform Management
  • Puzzlet SDK

Prompt Management

Puzzlet takes a developer-first approach to prompt management, treating prompts as files that live in your repository while still providing a platform for non-technical team members. All prompts are saved in AgentMark, a markdown-based format that is easy to write and read.

Read Docs

Observability

We build on top of OpenTelemetry for collecting telemetry data from your prompts. This helps you monitor, debug, and optimize your LLM applications in production. We provide traces, logs, metrics, and more.

Read Docs

Datasets

Create datasets to easily test your prompts in bulk against a large set of data.

Read Docs

CLI

We provide a CLI for initializing your Puzzlet app, customizing it, and deploying it to the cloud. Add new models to your platform with just a single command. You can also develop with Puzzlet locally using our serve command.

```bash
npx @puzzlet/cli@latest init
```

Read Docs

Platform Management

Puzzlet offers an intuitive platform for creating new git-synced apps, adding team members with roles, and setting up API keys for users.

Puzzlet SDK

Puzzlet’s SDK is simple and easy to use. We offer features like one-LOC observability, secure prompt fetching from our CDN, and more.

Read Docs

2025-01-03
Initial AgentMark Release

Features

  • Initial release of AgentMark
  • Support for OpenAI, Anthropic, and other LLM providers
  • MDX-based prompt templating
  • Type-safe prompt development
  • Tools and agents support

Documentation

  • Added comprehensive documentation
  • Included examples and guides
  • API reference documentation