{
  "generated_at": "2026-05-04T12:38:49Z",
  "items": [
    {
      "id": "github|https://github.com/sponsors/msitarzewski",
      "anchor": "cbdd6d289d",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "sponsors/msitarzewski — A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes, and proven deliverables.",
      "headline": "System of AI agents with different specializations for tasks such as frontend development, community engagement, and content validation.",
      "url": "https://github.com/sponsors/msitarzewski",
      "published_at": "2026-05-04T10:51:20Z",
      "discovered_at": "2026-05-04T10:51:20Z",
      "description": "A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes, and proven deliverables.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T10:51:38Z"
    },
    {
      "id": "github|https://github.com/virattt/dexter",
      "anchor": "217dda6a14",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "virattt/dexter — An autonomous agent for deep financial research",
      "headline": "Dexter is an autonomous agent designed for conducting deep financial research.",
      "url": "https://github.com/virattt/dexter",
      "published_at": "2026-05-04T10:51:20Z",
      "discovered_at": "2026-05-04T10:51:20Z",
      "description": "An autonomous agent for deep financial research",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T10:51:45Z"
    },
    {
      "id": "github|https://github.com/fspecii/ace-step-ui",
      "anchor": "98f19e07a0",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "fspecii/ace-step-ui — 🎵 The Ultimate Open Source Suno Alternative - Professional UI for ACE-Step 1.5 AI Music Generation. Free, local, unlimited. Stop paying for Suno!",
      "headline": "Open-source ace-step-ui provides a free, local UI for ACE-Step 1.5 AI music generation.",
      "url": "https://github.com/fspecii/ace-step-ui",
      "published_at": "2026-05-04T10:51:20Z",
      "discovered_at": "2026-05-04T10:51:20Z",
      "description": "🎵 The Ultimate Open Source Suno Alternative - Professional UI for ACE-Step 1.5 AI Music Generation. Free, local, unlimited. Stop paying for Suno!",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T10:51:56Z"
    },
    {
      "id": "github|https://github.com/cocoindex-io/cocoindex",
      "anchor": "6f0e6945d8",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "cocoindex-io/cocoindex — Incremental engine for long horizon agents 🌟 Star if you like it!",
      "headline": "Cocoindex is an incremental engine designed for long horizon agents.",
      "url": "https://github.com/cocoindex-io/cocoindex",
      "published_at": "2026-05-04T10:51:20Z",
      "discovered_at": "2026-05-04T10:51:20Z",
      "description": "Incremental engine for long horizon agents 🌟 Star if you like it!",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T10:52:03Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00005",
      "anchor": "6ff8158295",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference",
      "headline": "Cloud-based inference can match or exceed on-device performance for real-time control tasks like autonomous driving when adequately resourced.",
      "url": "https://arxiv.org/abs/2605.00005",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00005v1 Announce Type: new Abstract: The increasing deployment of deep neural networks (DNNs) in cyber-physical systems (CPS) enhances perception fidelity, but imposes substantial computational demands on execution platforms, posing challenges to real-time control deadlines. Traditional distributed CPS architectures typically favor on-device inference to avoid network variability and contention-induced delays on remote platforms. However, this design choice places significant energy and computational demands on the local hardware. In this work, we revisit the assumption that cloud-based inference is intrinsically unsuitable for latency-sensitive control tasks. We demonstrate that, when provisioned with high-throughput compute resources, cloud platforms can effectively amortize network and queueing delays, enabling them to match or surpass on-device performance for real-time decision-making. Specifically, we develop a formal analytical model that characterizes distributed inference latency as a function of the sensing frequency, platform throughput, network delay, and task-specific safety constraints. We instantiate this model in the context of emergency braking for autonomous driving and validate it through extensive simulations using real-time vehicular dynamics. Our empirical results identify concrete conditions under which cloud-based inference adheres to safety margins more reliably than its on-device counterpart. These findings challenge prevailing design strategies and suggest that the cloud is not merely a feasible option, but often the preferred inference location for distributed CPS architectures. In this light, the cloud is not as distant as traditionally perceived; in fact, it is closer than it appears.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:03:20Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00011",
      "anchor": "8d9a4838db",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "FedACT: Concurrent Federated Intelligence across Heterogeneous Data Sources",
      "headline": "FedACT proposes a resource-aware device scheduling approach for concurrent federated learning across heterogeneous devices, minimizing job completion time while ensuring fair participation.",
      "url": "https://arxiv.org/abs/2605.00011",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00011v1 Announce Type: new Abstract: Federated Learning (FL) enables collaborative intelligence across decentralized data source devices in a privacy-preserving way. While substantial research attention has been drawn to optimizing the learning process for an individual task, real-world applications increasingly require multiple machine learning tasks simultaneously training their models across a shared pool of devices. Naively applying single-FL optimization techniques in multi-FL systems results in suboptimal system performance, particularly due to device heterogeneity and resource inefficiency. To address such a critical open challenge, we introduce FedACT, a novel resource heterogeneity-aware device scheduling approach designed to efficiently schedule heterogeneous devices across multiple concurrent FL jobs, with the goal of minimizing their average job completion time (JCT). FedACT dynamically assigns devices to FL jobs based on an alignment scoring mechanism that evaluates the compatibility between available resources of devices and resource demands of jobs. Additionally, it incorporates participation fairness to ensure balanced contributions from devices across jobs, further enhancing the accuracy levels of learned global models. An optimal scheduling plan is formulated in FedACT by prioritizing devices with higher alignment scores, while ensuring fair participation across jobs. To evaluate the effectiveness of the proposed scheduling algorithm, we carried out comprehensive experiments using diverse FL jobs and benchmark datasets. Experimental results demonstrate that FedACT reduces the average JCT by up to 8.3x and improves model accuracy by up to 44.5%, compared to the state-of-the-art baselines.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:03:27Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00018",
      "anchor": "5b74f07bc7",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "What Physics do Data-Driven MoCap-to-Radar Models Learn?",
      "headline": "Physics-based metrics show MoCap-to-radar models often lack physical consistency despite low reconstruction error; temporal attention is crucial for transformers to learn the underlying physics.",
      "url": "https://arxiv.org/abs/2605.00018",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00018v1 Announce Type: new Abstract: Data-driven MoCap-to-radar models generate plausible micro-Doppler spectrograms, but do they actually learn the underlying physics? We introduce a physics-based interpretability framework to answer this question via two proposed complementary metrics: one measures alignment between model predictions and the physics-derived Doppler frequency, while the other tests whether predictions preserve the velocity-frequency relationship under velocity intervention. Both metrics require only MoCap input and model predictions, without access to measured radar data. Experiments across several model architectures reveal that low reconstruction error does not guarantee physical consistency: some, but not all, models achieve low error yet perform poorly on the two physics-based metrics. Further analysis shows that temporal attention is critical for transformer-based models to learn the underlying physics.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:03:43Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00020",
      "anchor": "800d5bf89a",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "AirFM-DDA: Air-Interface Foundation Model in the Delay-Doppler-Angle Domain for AI-Native 6G",
      "headline": "Foundation model for 6G wireless systems uses Delay-Doppler-Angle domain to untangle multipath components and reduce computational overhead.",
      "url": "https://arxiv.org/abs/2605.00020",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00020v1 Announce Type: new Abstract: The success of large foundation models is catalyzing a new paradigm for AI-native 6G network design: wireless foundation models for physical layer design. However, existing models often operate on channel state information (CSI) in the space-time-frequency (STF) domain, where distinct multipath components are inherently superimposed and structurally entangled. This hinders the learning of a universal channel representation. Meanwhile, their reliance on global attention mechanisms incurs prohibitive computational overhead. In this paper, we propose AirFM-DDA, an Air-interface Foundation Model operating in the Delay-Doppler-Angle (DDA) domain for physical-layer tasks. Specifically, AirFM-DDA reparameterizes CSI from the STF domain into the DDA domain to explicitly resolve multipath components along physically meaningful axes. It employs a window-based attention module augmented with framestructure-aware positional encoding (FS-PE). This window-based attention aligns with locally clustered multipath dependencies while avoiding quadratic-complexity global attention, and FS-PE injects frame-structure priors into the network. Extensive experiments demonstrate that AirFM-DDA achieves superior zero-shot generalization across unseen scenarios and datasets, consistently outperforming the baselines on channel prediction and estimation tasks. Compared to the global attention, its window-based attention reduces training and inference costs by nearly an order of magnitude. Moreover, AirFM-DDA maintains robustness under high mobility, large delay spreads, severe noise, and extreme aliasing conditions.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:03:55Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00050",
      "anchor": "51b8aa60ff",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Learning physically grounded traffic accident reconstruction from public accident reports",
      "headline": "Framework reconstructs traffic accidents from public NHTSA reports using multimodal learning, improving accuracy for traffic safety and autonomous driving.",
      "url": "https://arxiv.org/abs/2605.00050",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00050v1 Announce Type: new Abstract: Traffic accidents are routinely documented in textual reports, yet physically grounded accident reconstruction remains difficult because detailed scene measurements and expert reconstructions are scarce, costly and hard to scale. Here we formulate accident reconstruction from publicly accessible reports and scene measurements as a parameterized multimodal learning problem. We construct CISS-REC, a dataset of 6,217 real-world accident cases curated from the NHTSA Crash Investigation Sampling System, and develop a reconstruction framework that grounds report semantics to road topology and participant attributes, reconstructs lane consistent pre-impact motion, and refines collision relevant interactions through localized geometric reasoning and temporal allocation. Our method outperforms representative baselines on CISS-REC, achieving the strongest overall reconstruction fidelity, including improved accident point accuracy and collision consistency. These results show that public accident reports can serve as scalable computational substrates for quantitatively verifiable accident reconstruction, with potential value for traffic safety analysis, simulation and autonomous driving research.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:04:08Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00056",
      "anchor": "df3f29fb1b",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution",
      "headline": "Ensemble machine learning with Gaussian copula transformation improves prediction of groundwater heavy metal pollution.",
      "url": "https://arxiv.org/abs/2605.00056",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00056v1 Announce Type: new Abstract: Groundwater in the Densu Basin is increasingly threatened by heavy metal contamination, but conventional methods fail to capture the statistical complexity and spatial heterogeneity of pollution indicators. A key challenge is modelling the Heavy Metal Pollution Index (HPI), which is typically skewed and affected by correlated contaminants, leading to biased predictions without transformation. This study develops a predictive framework integrating response transformations with nested cross-validated ensemble machine learning. Three transformations (raw, log, and Gaussian copula) were applied to HPI and evaluated across six learners: support vector regression (SVM), $k$-nearest neighbours (k-NN), CART, Elastic Net, kernel ridge regression, and a stacked Lasso ensemble. Raw-scale models produced deceptively high fits (Elastic Net and stacked ensemble $R^2 \\approx 1.0$), suggesting over-optimism. The log transformation stabilised variance (SVM: $R^2 = 0.93$, RMSE $= 0.18$; k-NN: $R^2 = 0.92$, RMSE $= 0.20$). The Gaussian copula gave the most reliable results: stacked ensemble $R^2 = 0.96$ (RMSE $= 0.19$), with other learners maintaining high accuracy. Copula-based models improved residuals and produced spatially plausible maps. DBSCAN clustering revealed Fe and Mn as primary HPI contributors, consistent with regional hydrogeochemistry. Limitations include reliance on random (not spatial) cross-validation and basin-specific scope. Future work should explore spatial validation and other geological settings. Overall, distribution-aware ensembles with clustering diagnostics offer robust, interpretable assessments of groundwater contamination.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:04:20Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00064",
      "anchor": "bd27208a9e",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise",
      "headline": "Generalization bounds for stochastic gradient descent now support adaptive virtual noise geometry based on past optimization trajectories.",
      "url": "https://arxiv.org/abs/2605.00064",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00064v1 Announce Type: new Abstract: Information-theoretic generalization bounds analyze stochastic optimization by relating expected generalization error to the mutual information between learned parameters and training data. Virtual perturbation analyses of SGD add auxiliary Gaussian noise only in the proof, making mutual information tractable while leaving the actual SGD trajectory unchanged. Existing bounds, however, typically require perturbation covariances to be fixed independently of the optimization history, limiting their ability to represent geometries induced by moving gradient statistics, preconditioners, curvature proxies, and other pathwise information. We introduce predictable history-adaptive virtual perturbations, where the perturbation covariance at each iteration may depend on the past real SGD history but not on current or future randomness. This predictability enables a conditional Gaussian relative-entropy argument and yields generalization bounds for SGD with adaptive virtual-noise geometry. The bounds replace fixed sensitivity and gradient-deviation terms with conditional adaptive counterparts, include an output-sensitivity penalty from accumulated perturbation covariance, and reduce the deviation term to a conditional variance only under conditional unbiasedness. Since adaptive covariances may be data-dependent, we separate local Gaussian smoothing from global reference-kernel comparison. The resulting bound includes a covariance-comparison cost measuring the KL price of using an admissible reference geometry different from the actual adaptive covariance. Fixed-noise-style bounds are recovered under admissible synchronization, such as deterministic, public, or prefix-observable covariance rules. The framework recovers fixed isotropic and geometry-aware bounds as special cases while extending virtual perturbation analysis to history-dependent SGD without modifying the algorithm.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:04:29Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00068",
      "anchor": "791a383b17",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Human-in-the-Loop Meta Bayesian Optimization for Fusion Energy and Scientific Applications",
      "headline": "Human-in-the-Loop Meta Bayesian Optimization uses expert knowledge and few-shot learning to optimize experiments in data-scarce scientific domains like fusion energy.",
      "url": "https://arxiv.org/abs/2605.00068",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00068v1 Announce Type: new Abstract: Inertial Confinement Fusion (ICF) holds transformative promise for sustainable, near-limitless clean energy, yet remains constrained by prohibitively high costs and limited experimental opportunities. This paper presents Human-in-the-Loop Meta Bayesian Optimization (HL-MBO), a framework that integrates expert knowledge with few-shot, uncertainty-aware machine learning to accelerate discovery in data-scarce, high-stakes scientific domains. HL-MBO introduces a meta-learned surrogate model with an expert-informed acquisition function to recommend candidate experiments. To foster trust and enable informed decisions, HL-MBO also provides interpretable explanations of its suggestions. We show HL-MBO outperforms current BO methods on ICF energy yield optimization, as well as benchmarks in molecular optimization and critical temperature maximization for superconducting materials.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:04:40Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00069",
      "anchor": "b76a500778",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Soft-MSM: Differentiable Context-Aware Elastic Alignment for Time Series",
      "headline": "Researchers introduce Soft-MSM, a differentiable elastic distance metric for time series that enables gradient-based optimization with context-aware alignment costs.",
      "url": "https://arxiv.org/abs/2605.00069",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00069v1 Announce Type: new Abstract: Elastic distances like dynamic time warping (DTW) are central to time series machine learning because they compare sequences under local temporal misalignment. Soft-DTW is an adaptation of DTW that can be used as a gradient-based loss by replacing the hard minimum in its dynamic-programming recursion with a smooth relaxation. However, this approach does not directly extend to elastic distances whose transition costs depend on the local alignment context. Move-Split-Merge (MSM) is one such distance: it uses context-aware split and merge penalties and has often outperformed DTW in supervised and unsupervised time series machine learning tasks such as classification and clustering. We introduce Soft-MSM, a smooth relaxation of MSM and an elastic alignment loss with context-aware transition costs. Central to the formulation is a smooth gated surrogate for MSM's piecewise split/merge cost, which enables gradients through both the dynamic-programming recursion and the local transition structure. We derive the forward recursion, backward recursion, soft alignment matrix, closed-form gradient, limiting behaviour, and divergence-corrected formulation. Experiments on 112 UCR datasets show that Soft-MSM gives lower MSM barycentre loss than existing MSM barycentre methods, and yields significantly better clustering and nearest-centroid classification performance than Soft-DTW-based alternatives. An implementation is available in the open-source aeon toolkit.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:19:15Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00070",
      "anchor": "cdcc597e4d",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "CRADIPOR: Crash Dispersion Predictor",
      "headline": "CRADIPOR uses a Rank Reduction Autoencoder to predict numerical dispersion in crash simulations without repeating computations.",
      "url": "https://arxiv.org/abs/2605.00070",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00070v1 Announce Type: new Abstract: We present CRADIPOR, a numerical dispersion prediction tool for automotive crash simulations. Finite Element (FE) crash models are widely used throughout vehicle development, but their predictions are not strictly repeatable because of parallel computation and model complexity. As a result, performance criteria evaluated during post-processing may exhibit significant numerical dispersion, which complicates engineering decision-making. Although dispersion can be estimated by repeating the same simulation, this approach is generally impractical because of its high computational cost. This work therefore investigates a prediction tool that can be applied during routine crash-simulation post-processing without repeating the computation. The proposed approach relies on a Rank Reduction Autoencoder (RRAE) combined with supervised classification in order to identify regions sensitive to numerical dispersion. The comparative analysis suggests that the RRAE-based framework is more effective than the Random Forest baseline on the studied dataset. Among the tested signal representations, wavelet-based and slope-based inputs appear to be the most promising, with slope variations providing the best classification performance. These results support the use of structured latent representations for improving numerical-dispersion detection in automotive crash post-processing.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:19:27Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00082",
      "anchor": "753f5390ca",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Hyperspherical Forward-Forward with Prototypical Representations",
      "headline": "Hyperspherical Forward-Forward reformulates layer objectives as multi-class classification using prototypes, achieving 40x faster inference and improved accuracy over standard Forward-Forward.",
      "url": "https://arxiv.org/abs/2605.00082",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00082v1 Announce Type: new Abstract: The Forward-Forward (FF) algorithm presents a compelling, bio-inspired alternative to backpropagation. However, while efficient in training, it has a computationally prohibitive inference process that requires a separate forward pass for every class that is evaluated. In this work, we introduce the Hyperspherical Forward-Forward (HFF), a novel reformulation that resolves this critical bottleneck. Our core innovation is to reframe the local objective of each layer from a binary goodness-of-fit task to a direct multi-class classification problem within a hyperspherical feature space. We achieve this by learning a set of class-specific, unit-norm prototypes that act as geometric anchors and implicit negatives. This architectural innovation preserves the benefits of local training while enabling weight update and inference in a single forward pass, making it >40x faster than the original FF algorithm. Our method is simple to implement, scales effectively to modern convolutional architectures, and achieves superior accuracy on standard image classification benchmarks, closing the gap with backpropagation. Most notably, we are among the first greedy local-learning methods to report over 25% top-1 accuracy on ImageNet-1k, and 65.96% with transfer learning.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:19:37Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00083",
      "anchor": "ed45d0c854",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Comparative Analysis of Polygon-Based and Global Machine Learning Models for Bus Occupancy Prediction",
      "headline": "Spatially-clustered local machine learning models predict bus ridership as accurately as global models for urban transit systems.",
      "url": "https://arxiv.org/abs/2605.00083",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00083v1 Announce Type: new Abstract: Accurate forecasting of bus ridership (passenger numbers) is crucial for efficient management and optimization of public transport systems. Traditional forecasting models often fail to capture the unique and localized dynamics of different urban areas by treating the entire city as a single, homogeneous region. This paper introduces a novel framework that enhances bus ridership prediction by integrating a spatial clustering methodology with multi-dimensional feature analysis. The proposed framework utilizes a diverse set of data, including bus ridership data (by route number, time, and bus stop) complemented by a variety of open source data, such as spatial features (e.g., attractive destinations), meteorological conditions (e.g., temperature, rainfall), and temporal patterns (e.g., time of day, day of week). By clustering the urban area into distinct regions, based on the principle that bus stops in close proximity share similar ridership characteristics, a separate local forecasting model is trained for each of these clusters. This localized approach demonstrates an accuracy comparable to that of global models. The findings suggest that a spatially-aware, localized modeling strategy is effective for public transport prediction, paving the way for more targeted and efficient service improvements.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:19:51Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00126",
      "anchor": "78ca14c7b3",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting",
      "headline": "SPLICE combines latent diffusion with conformal prediction for time-series imputation, providing reliability guarantees and outperforming baselines in power systems.",
      "url": "https://arxiv.org/abs/2605.00126",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00126v1 Announce Type: new Abstract: Generative models for time-series imputation achieve strong reconstruction accuracy, yet provide no finite-sample reliability guarantees, a critical limitation in power systems where imputed values inform dispatch and planning. We introduce SPLICE (Self-supervised Predictive Latent Inpainting with Conformal Envelopes), a modular framework coupling latent generative imputation with distribution-free, online-adaptive prediction intervals. A JEPA encoder maps daily load segments into a 64-dimensional latent space; a conditional latent bridge with four sampling modes generates candidate gap trajectories; an hourly-conditioned decoder maps back to signal space; and Adaptive Conformal Inference (ACI) wraps the output with coverage-guaranteed prediction bands. The flow-matching variant achieves comparable quality to DDIM in 5--10 ODE steps (5-10x speedup). On thirteen load datasets (nine proprietary, three UCI Electricity, ETTh1), SPLICE achieves the lowest mean Load-only MSE (0.056), winning 9/12 non-degenerate datasets at 91-day gaps and 18/32 across all gap lengths vs. five established baselines, and produces the best CRPS (0.161, -18.3% vs. the strongest competitor). ACI delivers 93--95% empirical coverage, correcting under-coverage failures of up to 7.5 pp observed with static conformal prediction. A pooled JEPA encoder trained on nine feeds transfers to four unseen domains, matching or exceeding per-dataset oracles with only a quick bridge fine-tuning.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:20:02Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00130",
      "anchor": "40e76a549b",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Learning Fingerprints for Medical Time Series with Redundancy-Constrained Information Maximization",
      "headline": "Framework learns fingerprint tokens for medical time series using redundancy-constrained information maximization for interpretable representations.",
      "url": "https://arxiv.org/abs/2605.00130",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00130v1 Announce Type: new Abstract: Learning meaningful representations from medical time series (MedTS) such as ECG or EEG signals is a critical challenge. These signals are often high-dimensional, variable-length and rife with noise. Existing self-supervised approaches, such as Masked Autoencoders (MAEs) are highly effective for pre-training general-purpose encoders. However, they do not explicitly learn compact and semantically interpretable latent representations, typically relying on heuristic aggregation strategies such as global average pooling or a designated [CLS] token. We propose a novel framework that compresses a variable-length MedTS into a fixed-size set of $k$ latent Fingerprint Tokens. Our architecture employs a cross-attention bottleneck to generate these tokens and is trained with a dual-objective function. The first objective is a reconstruction loss, which ensures the tokens are \\textit{sufficient statistics} for the original data. The second, a diversity penalty based on the Total Coding Rate (TCR), explicitly minimizes the redundancy between tokens, encouraging them to become statistically \\textit{disentangled} representations. We present the theoretical justification for our method, framing it as a novel \\textbf{Disentangled Rate-Distortion} problem. This approach produces a low-dimensional, interpretable, and sample-efficient representation, where each token is encouraged to capture an independent factor of variation, paving the way for more robust digital biomarkers.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:20:25Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00133",
      "anchor": "9e717031ac",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Smart Profit-Aware Crop Advisory System: Kisan AI",
      "headline": "Kisan AI crop advisory system incorporates market prices with agronomic data to maximize farmer profitability.",
      "url": "https://arxiv.org/abs/2605.00133",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00133v1 Announce Type: new Abstract: Modern crop advisory systems exhibit a critical limitation termed \\textit{economic blindness}. These systems primarily optimize for biological yield, often overlooking market price, which can lead farmers toward agronomically sound yet financially unviable decisions. In this paper, we develop Kisan AI, a smart profit-aware crop advisory system that resolves the above-mentioned limitation through a research-driven, full-stack application. We train the Random Forest(RF) classifier model on a nine-feature benchmark dataset, the standard seven agronomic attributes augmented with a \\textit{market\\_price} variable, and evaluated against eight baseline models, considering the evaluation matrices, such as, accuracy, precision, recall, F1-score, and Log Loss. The RF model achieves the highest accuracy of 99.3\\% and the lowest Log Loss, confirming that the inclusion of market price as a predictive feature is both valid and impactful. We then implement the RF model within a multilingual progressive Web App alongside a Facebook Prophet six-month price forecasting engine and a MobileNetV2 disease detection module. A nine-language AI chatbot powered by the Anthropic Claude API unifies all modules into a single, mobile-installable platform accessible to farmers across India.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:20:36Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00140",
      "anchor": "467c803977",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Technical Report: Activation Residual Hessian Quantization (ARHQ) for Low-Bit LLM Quantization",
      "headline": "ARHQ is a post-training method that improves low-bit LLM quantization by isolating error-sensitive weights using activation residuals.",
      "url": "https://arxiv.org/abs/2605.00140",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00140v1 Announce Type: new Abstract: We present Activation Residual Hessian Quantization (ARHQ), a post-training weight splitting method designed to mitigate error propagation in low-bit activation-weight quantization. By constructing an input-side residual Hessian from activation quantization residuals (G_x), ARHQ analytically identifies and isolates error-sensitive weight directions into a high-precision low-rank branch. This is achieved via a closed-form truncated SVD on the scaled weight matrix W G^{1/2}_x . Experimental results on Qwen3-4B-Thinking-2507 demonstrate that ARHQ significantly improves layer-wise SNR and preserves downstream reasoning performance on ZebraLogic even under aggressive quantization. The code is available at https://github.com/BeautMoonQ/ARHQ.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:20:48Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00155",
      "anchor": "78968569c4",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback",
      "headline": "Wasserstein distributionally robust regret optimization framework proposed to address reward over-optimization in reinforcement learning from human feedback.",
      "url": "https://arxiv.org/abs/2605.00155",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00155v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) has become a core post-training step for aligning large language models, yet the reward signal used in RLHF is only a learned proxy for true human utility. From an operations research perspective, this creates a decision problem under objective misspecification: the policy is optimized against an estimated reward, while deployment performance is determined by an unobserved objective. The resulting gap leads to reward over-optimization, or Goodharting, where proxy reward continues to improve even after true quality deteriorates. Existing mitigations address this problem through uncertainty penalties, pessimistic rewards, or conservative constraints, but they can be computationally burdensome and overly pessimistic. We propose Wasserstein distributionally robust regret optimization (DRRO) for RLHF. Instead of pessimizing worst-case value as in standard DRO, DRRO pessimizes worst-case regret relative to the best policy under the same plausible reward perturbation. We study the promptwise problem through a simplex allocation model and show that, under an $\\ell_1$ ambiguity set, the inner worst-case regret admits an exact solution and the optimal policy has a water-filling structure. These results lead to a practical policy-gradient algorithm with a simple sampled-bonus interpretation and only minor changes to PPO/GRPO-style RLHF training. The framework also clarifies theoretically why DRRO is less pessimistic than DRO, and our experiments show that DRRO mitigates over-optimization more effectively than existing baselines while standard DRO is systematically over-pessimistic.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:33:36Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00161",
      "anchor": "f1f0a8444a",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Consistent Diffusion Language Models",
      "headline": "Novel discrete diffusion training method using stochastic bridges achieves faster parallel text generation with fewer steps.",
      "url": "https://arxiv.org/abs/2605.00161",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00161v1 Announce Type: new Abstract: Diffusion language models (DLMs) are an attractive alternative to autoregressive models because they promise sublinear-time, parallel generation, yet practical gains remain elusive as high-quality samples still demand hundreds of refinement steps. In continuous domains, consistency training along the probability-flow ODE is a popular recipe to accelerate diffusion. For discrete diffusion, no analogous sample-space ODE exists, making direct adaptation ill-defined. We argue that the natural discrete substitute is not a deterministic trajectory but its stochastic counterpart: the exact posterior bridge, available in closed form for broad corruption families including masked and uniform diffusion. Building on this observation, we introduce Multi-Path Discrete Consistency (MPDC), a new principle that trains a denoiser to be path-invariant in expectation across these stochastic bridges, and instantiate it as the Consistent Diffusion Language Model (CDLM), a single-stage, teacher-free training framework. A single CDLM objective unifies masked diffusion, continuous consistency models, and progressive/discrete distillation as analytic limits or empirical approximations of one common view. Empirically, CDLM establishes a new state of the art on both conditional and unconditional text-generation, consistently outperforming strong base discrete diffusion models and often even multi-stage distilled baselines across sampling budgets, with the largest gains in the few-step regime. Together, these results position CDLM as a principled and scalable foundation for the next generation of fast, high-fidelity discrete generative modeling.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:33:52Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00182",
      "anchor": "4250e5a250",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Towards A Generative Protein Evolution Machine with DPLM-Evo",
      "headline": "New discrete diffusion framework DPLM-Evo explicitly models substitution and insertion/deletion operations to improve protein sequence generation and mutation prediction.",
      "url": "https://arxiv.org/abs/2605.00182",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00182v1 Announce Type: new Abstract: Proteins are shaped by gradual evolution under biophysical and functional constraints. Protein language models learn rich evolutionary constraints from large-scale sequences, and discrete diffusion-based protein language models~(\\eg, DPLMs) are promising for both understanding and generation. However, existing DPLMs typically rely on masking-based absorbing diffusion that contradicts a simple biological intuition: proteins evolve through accumulated edits, not by emerging from masks. Consequently, these frameworks lack explicit pretraining objectives for substitution and insertion/deletion (indel) operations, limiting both optimization-style post-editing and flexible guided generation. To address these limitations, we present DPLM-Evo, an evolutionary discrete diffusion framework that explicitly predicts substitution, insertion, and deletion operations during denoising. DPLM-Evo decouples an upsampled-length latent alignment space from the variable-length observed sequence space, which makes indel-aware generation tractable and enables adaptive scaffold growth throughout the process with negligible computational overhead. To better align substitutions with real evolution, we further introduce a contextualized evolutionary noising kernel that produces biologically informed, context-dependent mutation patterns. Across tasks, DPLM-Evo improves sequence understanding and achieves state-of-the-art mutation effect prediction performance on ProteinGym in the single-sequence setting. It also enables variable-length simulated evolution, and post-editing/optimization of existing proteins via explicit edit trajectories.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:34:03Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00184",
      "anchor": "88f14328ea",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Introducing WARM-VR: Benchmark Dataset for Multimodal Wearable Affect Recognition in Virtual Reality",
      "headline": "WARM-VR is a multimodal wearable sensor dataset for emotion recognition in virtual reality environments, benchmarked with machine learning algorithms from 31 participants.",
      "url": "https://arxiv.org/abs/2605.00184",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00184v1 Announce Type: new Abstract: With the growing integration of human-computer interaction into everyday life, advances in machine learning have enabled systems to better perceive and respond to users' emotional states. Most existing affect recognition datasets focus on static environments, limiting their applicability to immersive multimedia contexts such as Virtual Reality (VR). In this paper, we introduce WARM-VR, a novel publicly available multimodal dataset designed to support affect recognition in immersive, multisensory environments using wearable sensing instrumentation. Data were collected from 31 participants aged 19-37 using wearable sensors: a wristband measuring Blood Volume Pulse (BVP), EDA, skin Temperature, three-axis Acceleration, and a chest strap recording ECG signals. Participants engaged in immersive VR experiences designed to elicit relaxation through a calming beach environment following stress induction via an arithmetic task. These sessions incorporated synchronized multimedia stimuli: visual, auditory, and olfactory. Affective states were assessed subjectively through validated self-report questionnaires and objectively through the analysis of physiological measurements. Statistical analysis of the questionnaires confirmed that VR relaxation significantly reduced negative affect, particularly with olfactory enhancement. Furthermore, we established a benchmark on the dataset using widely recognized machine learning algorithms. The best performance for binary classification from BVP data of valence, was obtained with a CNN and a CNN-Bi-GRU model, both achieving an average F1-score of 0.63 and an AUC of 0.69. For arousal, a lightweight Transformer architecture provided the most balanced results (F1-0 0.54 and F1-1 0.63), outperforming recurrent hybrids. In the relaxation task, a CNN-Bi-GRU model reached the highest overall performance (average F1-score 0.64, AUC 0.69).",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:34:16Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00185",
      "anchor": "141b177780",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Fair Dataset Distillation via Cross-Group Barycenter Alignment",
      "headline": "Cross-group barycenter alignment reduces fairness gaps caused by dataset distillation across demographic groups.",
      "url": "https://arxiv.org/abs/2605.00185",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00185v1 Announce Type: new Abstract: Dataset Distillation aims to compress a large dataset into a small synthetic one while maintaining predictive performance. We show that as different demographic groups exhibit distinct predictive patterns, the distillation process struggles to simultaneously preserve informative signals for all subgroups, regardless of whether group sizes are mildly or severely imbalanced. Consequently, models trained on distilled data can experience substantial performance drops for certain subgroups, leading to fairness gaps. Crucially, these gaps do not disappear by merely correcting group imbalance, since they stem from fundamental mismatches in subgroup predictive patterns rather than from sample-size disparities alone. We therefore formally analyze the interaction between these two sources of bias and cast the solution as identifying a group-imbalance-agnostic barycenter of the predictive information that induces similar representations across all subgroups. By distilling toward this shared aggregate representation, we show that group fairness concerns can be reduced. Our approach is compatible with existing distillation methods, and empirical results show that it substantially reduces bias introduced by dataset distillation.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:34:28Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00193",
      "anchor": "bff0a213fc",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "OTSS: Output-Targeted Soft Segmentation for Contextual Decision-Weight Learning",
      "headline": "OTSS employs soft segmentation to learn context-specific decision weights from logged data, achieving lower regret than hard-partition and mixture approaches.",
      "url": "https://arxiv.org/abs/2605.00193",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00193v1 Announce Type: new Abstract: Many machine learning systems make constrained decisions by optimizing factorized objectives, but the context-specific objective is often treated as fixed. We study contextual decision-weight learning: from logged decisions and proxy outputs, learn an optimizer-facing weight vector w(x) over interpretable decision factors z(x,d), rather than a direct policy or generic predictive score. We propose OTSS, an output-targeted soft-segmentation model that deploys the personalized decision-ready weight vector. At the function-class level, the theory highlights a hard-versus-soft distinction. Hard partitions incur an approximation-estimation tradeoff under overlap, while a realizable fixed-K soft class removes the hard-partition approximation floor and attains a parametric rate. We evaluate OTSS in controlled benchmarks with finite evaluation libraries, where the true weight vector and downstream regret can be computed exactly. In the representative overlap setting, OTSS attains the lowest mean regret among the comparators, including EM mixture regression, the strongest soft-mixture baseline in our comparison; it matches EM on coefficient recovery while running about two orders of magnitude faster. In a matched K=5 benchmark, OTSS remains competitive under hard-routed truth and improves as heterogeneity becomes softer and sample size grows. On a fixed Complete Journey retail anchor with real household covariates and action geometry, OTSS again achieves the lowest mean-regret point estimate.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:34:42Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00195",
      "anchor": "0b26851fd9",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Diversity in Large Language Models under Supervised Fine-Tuning",
      "headline": "Researchers identify that supervised fine-tuning reduces LLM diversity due to low-frequency pattern neglect and knowledge forgetting, proposing TOFU loss to recover diversity while maintaining quality.",
      "url": "https://arxiv.org/abs/2605.00195",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00195v1 Announce Type: new Abstract: Supervised Fine-Tuning (SFT) is essential for aligning Large Language Models (LLMs) with user intent, yet it is believed to suppress generative diversity. Although this reduction is frequently referenced, formal empirical testing of the phenomenon remains limited. The expressiveness of LLMs by itself was addressed by multiple prior methods. Their varying perspectives suggest that deeper analysis could yield further improvements. In this study, we attribute the decline to two primary drivers: the neglect of low-frequency patterns within fine-tuning datasets and the forgetting of preexisting knowledge. Motivated by our theoretical analysis, we develop Tempered Focal (TOFU) loss, a novel objective that addresses both stated challenges simultaneously. Our extensive evaluation confirms at scale that generation breadth narrows after SFT and strengthens the hypothesis explaining this effect. Across multiple models and benchmarks, we demonstrate that TOFU enhances output diversity while preserving high response quality, offering a principled approach to SFT.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:34:51Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00206",
      "anchor": "7e81c26986",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "State Stream Transformer (SST) V2: Parallel Training of Nonlinear Recurrence for Latent Space Reasoning",
      "headline": "State Stream Transformer V2 enables parameter-efficient reasoning in latent space through nonlinear recurrence with parallel training.",
      "url": "https://arxiv.org/abs/2605.00206",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00206v1 Announce Type: new Abstract: Current transformers discard their rich latent residual stream between positions, reconstructing latent reasoning context at each new position and leaving potential reasoning capacity untapped. The State Stream Transformer (SST) V2 enables parameter-efficient reasoning in continuous latent space through an FFN-driven nonlinear recurrence at each decoder layer, where latent states are streamed horizontally across the full sequence via a learned blend. This same mechanism supports continuous latent deliberation per position at inference time, dedicating additional FLOPs to exploring abstract reasoning before committing to a token. A two-pass parallel training procedure resolves the sequential dependency of the recurrence to allow compute-efficient training. Hidden state analysis shows the state stream facilitates reasoning through exploration of distinct semantic basins in continuous latent space, where transitions at content-dependent positions move the model into a substantially different Bayesian posterior, directly influencing the latent space at future positions. We also find, via a learned probe, that at the first generated token position, the latent state already predicts whether the eventual answer will survive or break under additional latent computation for every subsequent position. Co-trained into an existing 27B backbone using only a small dataset of GSM8K examples, the SST delivers a +15.15 point gain over a fine-tuning-matched baseline on out-of-distribution GPQA-Diamond and cuts that same baseline's remaining GSM8K errors by 46%, together showing that the reasoning improvement is attributable to the architectural mechanism rather than scale or training data. On GPQA-Diamond, the resulting 27B SST also achieves higher accuracy than several larger open-weight and proprietary systems, including open-weight models up to 25 times larger.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:35:02Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00222",
      "anchor": "12c2e44f84",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "CompleteRXN: Toward Completing Open Chemical Reaction Databases",
      "headline": "CompleteRXN introduces a benchmark for completing incomplete chemical reactions in USPTO, achieving 99.20% accuracy on standard tests but showing significant performance drops on real-world data.",
      "url": "https://arxiv.org/abs/2605.00222",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00222v1 Announce Type: new Abstract: Chemical reaction datasets such as USPTO suffer from substantial incompleteness, frequently missing byproducts, co-reactants, and stoichiometric coefficients. This limits their applicability and reliability in downstream applications. Here, we introduce CompleteRXN, a large-scale supervised benchmark for reaction completion under realistic missing-data conditions. We construct a dataset of aligned incomplete and atom-balanced reactions by mapping USPTO records to curated mechanistic reactions. We evaluate representative baselines, including a novel encoder-decoder reaction completion model with constrained decoding, the Constrained Reaction Balancer (CRB), and a recent algorithmic method, SynRBL. On our CompleteRXN benchmark, the CRB achieves high performance across splits of increasing difficulty, reaching 99.20% equivalence accuracy on the random split and 91.12% on the extreme out-of-distribution split. SynRBL produces many balanced and chemically plausible completions, but with lower accuracy on the benchmark test splits. Across all methods, performance degrades with increasing incompleteness. We observe a substantial drop when evaluating on reactions outside the benchmark (full uncurated USPTO), highlighting the gap between benchmark performance and practical robustness and motivating future work.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:48:42Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00237",
      "anchor": "6b225825b2",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Bayesian Optimization in Linear Time",
      "headline": "Bayesian optimization method using recursive binary partitioning reduces computational complexity from cubic to linear while improving optimization performance.",
      "url": "https://arxiv.org/abs/2605.00237",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00237v1 Announce Type: new Abstract: Bayesian optimization is a sequential method for minimizing objective functions that are expensive to evaluate and about which few assumptions can be made. By using all gathered data to train a Gaussian process model for the function and adaptively employing a mixture of global exploration and local exploitation, this method has been used for optimization in many fields including machine learning, automotive engineering and reinforcement learning. However, the standard method suffers from two problems: 1) with cubic computational complexity in the training-set size it eventually becomes computationally infeasible to train the model, and 2) globally modeling the objective function is not necessarily optimal given the local nature of minimization. Using flexible and recursive binary partitioning of the search space, we adapt both the modeling and acquisitive aspects of standard Bayesian optimization to work harmoniously with the partitioning scheme, thereby ameliorating both standard shortcomings. We compare our method against a commonly used Bayesian optimization library on seven challenging test functions, ranging in dimensionality from $6$ to $124$, and show that our method achieves superior optimization performance in all tests. In addition our method has linear computational complexity.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:48:50Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00260",
      "anchor": "05afb02d0f",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "NLPOpt-Net: A Learning Method for Nonlinear Optimization with Feasibility Guarantees",
      "headline": "NLPOpt-Net is an unsupervised learning architecture that solves constrained nonlinear optimization problems with guaranteed constraint satisfaction.",
      "url": "https://arxiv.org/abs/2605.00260",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00260v1 Announce Type: new Abstract: Nonlinear Parametric Optimization Network (NLPOpt-Net) is an unsupervised learning architecture to solve constrained nonlinear programs (NLP). Given the structure of an NLP, it learns the parametric solution maps with guaranteed constraint satisfaction. The architecture consists of a backbone neural network (NN) followed by a multilayer ($k$-layered) projection. While the NN drives toward optimality through a loss function consisting of a modified Lagrangian augmented with a consistency loss, the projection ensures feasibility by projecting the NN predictions in the original constraint manifold. Instead of typical distance minimization, our projection exploits local quadratic approximations of the original NLP. Under certain conditions (such as convexity), the projection has a descent property, which improves the NN predictions further. NLPOpt-Net deploys an inversion-free, modified Chambolle-Pock algorithm to solve the constrained quadratic projections during the forward pass and uses the implicit function theorem for efficient backpropagation. The fixed structure of the projection further allows decoupling of the NN and the projection once the training is complete. NLPOpt-Net solves large-scale convex QP, QCQP, NLP, and nonconvex problems with near zero optimality gap and constraint violations reduced to machine precision. Additionally, it provides near accurate prediction of the active sets and corresponding dual variables, thereby enabling a scalable approach for multiparametric programming. Compiling the projection in C provides order of magnitude improvement in inference time compared to JAX. We provide the codes and NLPOpt-Net as a ready to use package that includes GPU support.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:49:00Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00264",
      "anchor": "af767c1ae8",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Pessimism-Free Offline Learning in General-Sum Games via KL Regularization",
      "headline": "KL regularization enables pessimism-free offline multi-agent reinforcement learning, achieving accelerated equilibrium recovery rates in general-sum games.",
      "url": "https://arxiv.org/abs/2605.00264",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00264v1 Announce Type: new Abstract: Offline multi-agent reinforcement learning in general-sum settings is challenged by the distribution shift between logged datasets and target equilibrium policies. While standard methods rely on manual pessimistic penalties, we demonstrate that KL regularization suffices to stabilize learning and achieve equilibrium recovery. We propose General-sum Anchored Nash Equilibrium (GANE), which recovers regularized Nash equilibria at an accelerated statistical rate of $\\widetilde{O}(1/n)$. For computational tractability, we develop General-sum Anchored Mirror Descent (GAMD), an iterative algorithm converging to a Coarse Correlated Equilibrium at the standard rate of $\\widetilde{O}(1/\\sqrt{n}+1/T)$. These results establish KL regularization as a standalone mechanism for pessimism-free offline learning that achieves equivalent or accelerated rates in multi-player general-sum games.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:49:07Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00265",
      "anchor": "b6e0245264",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Polaris: Coupled Orbital Polar Embeddings for Hierarchical Concept Learning",
      "headline": "Polaris, a polar hyperspherical embedding framework, learns hierarchical representations by separating semantic meaning from structure, achieving significant improvements in taxonomy expansion tasks.",
      "url": "https://arxiv.org/abs/2605.00265",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00265v1 Announce Type: new Abstract: Real-world knowledge is often organized as hierarchies such as product taxonomies, medical ontologies, and label trees, yet learning hierarchical representations is challenging due to asymmetric structure and noisy semantics. We introduce Polaris, a polar hyperspherical embedding framework that separates semanticity from hierarchy using angular geometry and radius, enabling the learning of meaning and structure without interference. To map latent representation onto the sphere, we project it to the tangent space at the north pole, apply the exponential map, and learn unit-norm representations using spherical linear layers. Polaris then combines robust local constraints, global regularization that prevents geometric collapse, and uncertainty-aware asymmetric objectives that encourage directional containment. At inference time, Polaris uses structure-guided retrieval to efficiently narrow down candidate parents before final ranking. We evaluate Polaris on different settings of taxonomy expansion - spanning trees, multi-parent DAGs, and multimodal hierarchies, showing consistent improvements of up to ~19 points in top-K retrieval and up to ~60% reduction in mean rank over fourteen strong baselines.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:49:16Z"
    },
    {
      "id": "arxiv-cs.LG|https://arxiv.org/abs/2605.00267",
      "anchor": "a8779c7ef9",
      "source_id": "arxiv-cs.LG",
      "source_name": "arXiv cs.LG",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Jailbroken Frontier Models Retain Their Capabilities",
      "headline": "More capable frontier models experience minimal performance degradation from jailbreaks, with Opus 4.6 losing only 7.7% versus Haiku 4.5 at 33.1%.",
      "url": "https://arxiv.org/abs/2605.00267",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:02Z",
      "description": "arXiv:2605.00267v1 Announce Type: new Abstract: As language model safeguards become more robust, attackers are pushed toward developing increasingly complex jailbreaks. Prior work has found that this complexity imposes a \"jailbreak tax\" that degrades the target model's task performance. We show that this tax scales inversely with model capability and that the most advanced jailbreaks effectively yield no reduction in model capabilities. Evaluating 28 jailbreaks on five benchmarks across Claude models ranging in capability from Haiku 4.5 to Opus 4.6, we find Haiku 4.5 loses an average of 33.1% on benchmark performance when jailbroken, while Opus 4.6 at max thinking effort loses only 7.7%. We also observe that across all models, reasoning-heavy tasks display considerably more degradation than knowledge-recall tasks. Finally, Boundary Point Jailbreaking, currently the strongest jailbreak against deployed classifiers, achieves near-perfect classifier evasion with near-zero degradation across safeguarded models. We recommend that safety cases for frontier models should not rely on a meaningful capability degradation from jailbreaks.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:49:28Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00022",
      "anchor": "7294c4748d",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Putting HUMANS first: Efficient LAM Evaluation with Human Preference Alignment",
      "headline": "Researchers found 50-example subsets effectively evaluate large audio models while regression-weighted subsets better predict human preferences than full benchmarks.",
      "url": "https://arxiv.org/abs/2605.00022",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00022v1 Announce Type: new Abstract: The rapid proliferation of large audio models (LAMs) demands efficient approaches for model comparison, yet comprehensive benchmarks are costly. To fill this gap, we investigate whether minimal subsets can reliably evaluate LAMs while reducing costs and data redundancy. Analyzing 10 subset selection methods with 18 audio models across 40 tasks covering major LAM evaluation dimensions, we show that subsets of just 50 examples (0.3% of data) can achieve over 0.93 Pearson correlation with full benchmark scores. To understand how well these scores align with what practitioners ultimately care about, user satisfaction, we collect 776 human preference ratings from realistic voice assistant conversations, finding that both subsets and full benchmark achieve only 0.85 correlation with human. To better predict preferences, we trained regression models on these selected subsets, achieving 0.98 correlation -- outperforming regression models trained on both random subsets and the full benchmark. This demonstrates that in regression modeling, well-curated subsets outpredict the full benchmark, showing quality over quantity. We open-source these regression-weighted subsets as the HUMANS benchmark, an efficient proxy for LAM evaluation that captures both benchmark performance and user preferences.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:49:41Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00086",
      "anchor": "9629109435",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "NorBERTo: A ModernBERT Model Trained for Portuguese with 331 Billion Tokens Corpus",
      "headline": "Researchers introduce NorBERTo, a modern Portuguese encoder trained on 331 billion tokens, achieving top PLUE and ASSIN 2 results.",
      "url": "https://arxiv.org/abs/2605.00086",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00086v1 Announce Type: new Abstract: High-quality corpora are essential for advancing Natural Language Processing (NLP) in Portuguese. Building on previous encoder-only models such as BERTimbau and Albertina PT-BR, we introduce NorBERTo, a modern encoder based on the ModernBERT architecture, featuring long-context support and efficient attention mechanisms. NorBERTo is trained on Aurora-PT, a newly curated Brazilian Portuguese corpus comprising 331 billion GPT-2 tokens collected from diverse web sources and existing multilingual datasets. We systematically benchmark NorBERTo against Strong baselines on semantic similarity, textual entailment and classification tasks using standardized datasets such as ASSIN 2 and PLUE. On PLUE, NorBERTo-large achieves the best results among the encoder models we evaluated, notably reaching 0.9191 F1 on MRPC and 0.7689 accuracy on RTE. On ASSIN 2, NorBERTo-large attains the highest entailment F1 (~0.904) among all encoders considered, although Albertina-900M and BERTimbau-large still hold an advantage. To the best of our knowledge, Aurora-PT is currently the largest openly available monolingual Portuguese corpus, surpassing previous resources. NorBERTo provides a modern, mid-sized encoder designed for realistic deployment scenarios: it is straight-forward to fine-tune, efficient to serve, and well suited as a backbone for retrieval-augmented generation and other downstream Portuguese NLP systems.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T04:49:56Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00113",
      "anchor": "218a3a082b",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "How Frontier LLMs Adapt to Neurodivergence Context: A Measurement Framework for Surface vs. Structural Change in System-Prompted Responses",
      "headline": "NDBench benchmark finds frontier LLMs significantly adapt outputs when prompted with neurodivergence context, producing longer and more structured responses.",
      "url": "https://arxiv.org/abs/2605.00113",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00113v1 Announce Type: new Abstract: We examine if frontier chat-based large language models (LLMs) adjust their outputs based on neurodivergence (ND) context in system prompts and describe the nature of these adjustments. Specifically, we propose NDBench, a 576-output benchmark involving two frontier models, three system prompt types (baseline, ND-profile assertion, and ND-profile assertion with explicit instructions for adjustments), four canonical ND profiles, and 24 prompts across four categories, one of which involves an adversarial masking strategy. Four trends emerge consistently from our findings. First, LLMs show significant adaptation under ND context, where fully instructed conditions yield lengthier and more structured outputs, characterized by higher token counts, more headings, and more granular steps (p = 0.67) and thus can be considered primary results. NDBench is made publicly available along with its prompts, outputs, code, and other resources, forming a reproducible framework for auditing future LLMs' adaptation to ND awareness.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:03:49Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00116",
      "anchor": "ad19c7e80d",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts",
      "headline": "Researchers introduce ViLegalNLI, the first large-scale Vietnamese legal NLI dataset containing 42,012 annotated premise-hypothesis pairs from statutory documents.",
      "url": "https://arxiv.org/abs/2605.00116",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00116v1 Announce Type: new Abstract: In this article, we introduce ViLegalNLI, the first large-scale Vietnamese Natural Language Inference (NLI) dataset specifically constructed for the legal domain. The dataset consists of 42,012 premise-hypothesis pairs derived from official statutory documents and annotated with binary inference labels (Entailment and Non-entailment). It covers multiple legal domains and reflects realistic legal reasoning scenarios characterized by structured logic, conditional clauses, and domain-specific terminology. To construct ViLegalNLI, we propose a semi-automatic data generation framework that integrates large language models for controlled hypothesis generation and systematic quality validation procedures. The framework incorporates artifact mitigation strategies and cross-model validation to improve annotation reliability and ensure legal consistency. The resulting dataset captures diverse reasoning patterns, including paraphrasing, logical implication, and legally invalid inferences, thereby providing a comprehensive benchmark for Vietnamese legal inference tasks. We conduct extensive experiments on the ViLegalNLI using multilingual models, Vietnamese-specific pretrained language models, and instruction-tuned large language models. The results show that few-shot LLM configurations consistently achieve superior performance, while performance is significantly influenced by hypothesis length, lexical overlap, and reasoning complexity. Cross-domain evaluations further reveal the challenges of generalizing legal inference across distinct legal fields. Overall, ViLegalNLI establishes a foundational benchmark for Vietnamese legal NLI and supports future research in legal reasoning, statutory text understanding, and the development of reliable AI systems for legal analysis and decision support. The dataset is publicly available for research purposes.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:03:59Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00119",
      "anchor": "0b26aa027a",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Cultural Benchmarking of LLMs in Standard and Dialectal Arabic Dialogues",
      "headline": "New cultural benchmark covering 13 Arabic-speaking countries reveals significant LLM performance gaps between Modern Standard Arabic and dialects.",
      "url": "https://arxiv.org/abs/2605.00119",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00119v1 Announce Type: new Abstract: There is a significant gap in evaluating cultural reasoning in LLMs using conversational datasets that capture culturally rich and dialectal contexts. Most Arabic benchmarks focus on short text snippets in Modern Standard Arabic (MSA), overlooking the cultural nuances that naturally arise in dialogues. To address this gap, we introduce ArabCulture-Dialogue, a culturally grounded conversational dataset covering 13 Arabic-speaking countries, in both MSA and each country's respective dialect, spanning 12 daily-life topics and 54 fine-grained subtopics. We utilize the dataset to form three benchmarking tasks: (i) multiple-choice cultural reasoning, (ii) machine translation between MSA and dialects, and (iii) dialect-steering generation. Our experiments indicate that the performance gap between MSA and Arabic dialects still exists, whereby the models perform worse on all three tasks in the dialectal setup, compared to the MSA one.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:04:12Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00143",
      "anchor": "71313f7b34",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Timing is Everything: Temporal Scaffolding of Semantic Surprise in Humor",
      "headline": "Temporal features outweigh semantic incongruity in humor appreciation, with strategic pauses before punchlines driving audience engagement.",
      "url": "https://arxiv.org/abs/2605.00143",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00143v1 Announce Type: new Abstract: Humor is a fundamental cognitive phenomenon in which humans derive pleasure from the expectation violations and their resolution, exemplifying the brain's dynamic capacity for predictive processing. Classical humor theories emphasize semantic incongruity as the primary driver of amusement, yet overlook temporal dynamics despite comedians' intuition that \"timing is everything.\" The extent to which temporal structure contributes to humor appreciation and how it interacts with semantic content remains poorly understood. Here, we propose the Dual Prediction Violation (DPV) framework to capture the interplay between content and timing. By analyzing 828 professional Chinese stand-up performances, we show that temporal features substantially outweigh semantic incongruity in predicting audience appreciation. Specifically, we find that peak semantic violations matter more than average incongruity levels, and pauses systematically lengthen before high-surprise punchlines--a strategic coupling that distinguishes successful from unsuccessful performances. These findings reframe humor as temporally scaffolded, where timing and semantic content operate in strategic coordination rather than independently. Our DPV framework bridges humor theory with predictive processing, demonstrating that temporal structure plays a central role in naturalistic humor appreciation with implications for understanding multi-scale prediction integration in linguistic processing.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:04:22Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00199",
      "anchor": "b959d4f846",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners",
      "headline": "RSAT method trains small language models to answer table questions with step-by-step reasoning backed by cell-level citations, achieving 3.7x improvement in faithfulness.",
      "url": "https://arxiv.org/abs/2605.00199",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00199v1 Announce Type: new Abstract: When a language model answers a table question, users have no way to verify which cells informed which reasoning steps. We introduce RSAT, a method that trains small language models (SLMs, 1-8B) to produce step-by-step reasoning with cell-level citations grounded in table evidence. Phase 1 (SFT) teaches a structured JSON output format from verified reasoning traces. Phase 2 (GRPO) optimizes a composite reward centered on NLI-based faithfulness, alongside citation validity and parsimony. Across six models from two families-Qwen 2.5 (1.5B/3B/7B) and Llama 3 (1B/3B/8B)-RSAT improves faithfulness 3.7$\\times$ over SFT alone (0.224$\\rightarrow$0.826), with near-perfect citation validity (0.992). Post-hoc attribution collapses below 13% format success, confirming that attribution must be integrated into reasoning, not retrofitted. Ablations show the faithfulness reward is essential: removing it drops faithfulness from 0.97 to 0.03.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:04:31Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00200",
      "anchor": "9b587ffd1e",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Confidence Estimation in Automatic Short Answer Grading with LLMs",
      "headline": "Study proposes hybrid confidence framework combining model-based signals with dataset uncertainty for more reliable LLM-based automatic short answer grading.",
      "url": "https://arxiv.org/abs/2605.00200",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00200v1 Announce Type: new Abstract: Automatic Short Answer Grading (ASAG) with generative large language models (LLMs) has recently demonstrated strong performance without task-specific fine-tuning, while also enabling the generation of synthetic feedback for educational assessment. Despite these advances, LLM-based grading remains imperfect, making reliable confidence estimates essential for safe and effective human-AI collaboration in educational decision-making. In this work, we investigate confidence estimation for ASAG with LLMs by jointly considering model-based confidence signals and dataset-derived uncertainty. We systematically compare three model-based confidence estimation strategies, namely verbalizing, latent, and consistency-based confidence estimation, and show that model-based confidence alone is insufficient to reliably capture uncertainty in ASAG. To address this limitation, we propose a hybrid confidence framework that integrates model-based confidence signals with an explicit estimate of dataset-derived aleatoric uncertainty. Aleatoric uncertainty is operationalized by clustering semantically embedded student responses and quantifying within-cluster heterogeneity. Our results demonstrate that the proposed hybrid confidence measure yields more reliable confidence estimates and improves selective grading performance compared to single-source approaches. Overall, this work advances confidence-aware LLM-based grading for human-in-the-loop assessment, supporting more trustworthy AI-assisted educational assessment systems.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:04:41Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00226",
      "anchor": "84ef6c3631",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Why Do LLMs Struggle in Strategic Play? Broken Links Between Observations, Beliefs, and Actions",
      "headline": "LLMs show gaps between internal beliefs and actions in strategic games; beliefs degrade under reasoning despite exceeding verbal accuracy.",
      "url": "https://arxiv.org/abs/2605.00226",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00226v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly tasked with strategic decision-making under incomplete information, such as in negotiation and policymaking. While LLMs can excel at many such tasks, they also fail in ways that are poorly understood. We shed light on these failures by uncovering two fundamental gaps in the internal mechanisms underlying the decision-making of LLMs in incomplete-information games, supported by experiments with open-weight models Llama 3.1, Qwen3, and gpt-oss. First, an observation-belief gap: LLMs encode internal beliefs about latent game states that are substantially more accurate than their own verbal reports, yet these beliefs are brittle. In particular, the belief accuracy degrades with multi-hop reasoning, exhibits primacy and recency biases, and drifts away from Bayesian coherence over extended interactions. Second, a belief-action gap: The implicit conversion of internal beliefs into actions is weaker than that of the beliefs externalized in the prompt, yet neither belief-conditioning consistently achieves higher game payoffs. These results show how analyzing LLMs' internal processes can expose systematic vulnerabilities that warrant caution before deploying LLMs in strategic domains without robust guardrails.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:05:02Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00227",
      "anchor": "c701a16816",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Persona-Grounded Safety Evaluation of AI Companions in Multi-Turn Conversations",
      "headline": "Researchers present a scalable safety evaluation framework for AI companions, demonstrating that Replika exhibits a narrow emotional range and normalizes unsafe content like self-harm.",
      "url": "https://arxiv.org/abs/2605.00227",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00227v1 Announce Type: new Abstract: There are growing concerns about the risks posed by AI companion applications designed for emotional engagement. Existing safety evaluations often rely on self-reported user data or interviews, offering limited insights into real-time dynamics. We present the first end-to-end scalable framework for controlled simulation and safety evaluation of multi-turn interactions with AI companion applications. Our framework integrates four key components: persona construction with clinical and psychometric validation, persona-specific scenario generation, scenario-driven multi-turn simulation with a dialogue refinement module that preserves persona fidelity, and harm evaluation. We apply this framework to evaluate how Replika, a widely used AI companion app, responds to high-risk user groups. We construct 9 personas representing individuals with depression, anxiety, PTSD, eating disorders, and incel identity, and collect 1,674 dialogue pairs across 25 high-risk scenarios. We combine emotion modeling and LLM-assisted utterance-and harm-level classification to analyze these exchanges. Results show that Replika exhibits a narrow emotional range dominated by curiosity and care, while frequently mirroring or normalizing unsafe content such as self-harm, disordered eating, and violent-fantasy narratives. These findings highlight how controlled persona simulations can serve as a scalable testbed for evaluating safety risks in AI companions.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:05:13Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00238",
      "anchor": "f1a3d452f5",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Estimating LLM Grading Ability and Response Difficulty in Automatic Short Answer Grading via Item Response Theory",
      "headline": "Researchers introduce item response theory framework for evaluating LLM-based automatic short answer grading, revealing performance differences across response difficulty levels.",
      "url": "https://arxiv.org/abs/2605.00238",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00238v1 Announce Type: new Abstract: Automated short answer grading (ASAG) with large language models (LLMs) is commonly evaluated with aggregate metrics such as macro-F1 and Cohen's kappa. However, these metrics provide limited insight into how grading performance varies across student responses of differing grading difficulty. We introduce an evaluation framework for LLM-based ASAG based on item response theory (IRT), which models grading correctness as a function of latent grader ability and response grading difficulty. This formulation enables response-level analysis of where LLM graders succeed or fail and reveals robustness differences that are not visible from aggregate scores alone. We apply the framework to 17 open-weight LLMs on the SciEntsBank and Beetle benchmarks. The results show that even models with similar overall performance differ substantially in how sharply their grading accuracy declines as response difficulty increases. In addition, confusion patterns show that errors on difficult responses concentrate disproportionately on the \\texttt{partially\\_correct\\_incomplete} label, indicating a tendency toward intermediate-label collapse under ambiguity. To characterize difficult responses, we further analyze semantic and linguistic correlates of estimated difficulty. Across both datasets, higher difficulty is associated with weaker semantic alignment to the reference answer, stronger contradiction signals, and greater semantic isolation in embedding space. Overall, these results show that item response theory offers a useful framework for evaluating LLM-based ASAG beyond aggregate performance measures.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:18:55Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00253",
      "anchor": "169a12b0f8",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Lost in State Space: Probing Frozen Mamba Representations",
      "headline": "Mamba frozen representations fail to produce semantic sentence summaries without fine-tuning due to severe anisotropy and representational collapse.",
      "url": "https://arxiv.org/abs/2605.00253",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00253v1 Announce Type: new Abstract: Mamba's recurrent state h_t is, by construction, a compressed summary of every token seen so far. This raises a tempting hypothesis: if we extract token-level outputs y_t at fixed patch boundaries, we obtain semantic sentence summaries for free, with no pooling head, no fine-tuning, and no [CLS] token. We test this hypothesis carefully. Across five benchmarks (SST-2, CoLA, MRPC, STS-B, IMDb), we compare four strategies for extracting frozen sentence representations from a pretrained Mamba-130M backbone under a strict frozen-feature probing protocol, using three random seeds where computationally feasible. The results do not support the hypothesis: patch boundary readouts do not consistently outperform simple mean pooling. We identify and quantify two structural pathologies: severe anisotropy (mean pairwise cosine similarity 0.9999, std 0.000044) and representational collapse in the raw final SSM state (MCC = 0.000 on CoLA across all three seeds, confirmed via confusion matrix). We further propose orthogonal injection, a modified recurrence that constrains new information per",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:19:06Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00257",
      "anchor": "c71d497e2a",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Retrieval-Augmented Reasoning for Chartered Accountancy",
      "headline": "CA-ThinkFlow RAG framework achieves 68.75% of GPT-4o/Claude performance on Indian chartered accountancy benchmarks using quantized models.",
      "url": "https://arxiv.org/abs/2605.00257",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00257v1 Announce Type: new Abstract: The inception of Large Language Models (LLMs) has catalyzed AI adoption in the finance sector, yet their reliability in complex, jurisdiction-specific tasks like Indian Chartered Accountancy (CA) remains limited. The models display difficulty in executing numerical tasks which require multiple steps while also needing advanced knowledge about legal regulations and the method of scaling their operations is not feasible in settings which have limited access to resources. We present CA-ThinkFlow as a parameter-efficient Retrieval-Augmented Generation (RAG) framework which operates with a 14B, 4-bit-quantized reasoning model, 14B-DeepSeek-R1, and a layout-aware Docling extraction system which maintains document structure during extraction. CA-ThinkFlow uses a basic RAG method which automatically adds retrieved information into the prompt, while it depends on the model's built-in Chain-of-Thought (CoT) functions to create context and produce correct answers. The system we developed system operates at performance levels which match large proprietary models when we tested it on the multi-level CA-Ben benchmark, achieving Scholastic Reliability Coefficient (SRC) results which equal 68.75\\% of GPT-4o and Claude 3.5 Sonnet. The framework shows high efficiency and strength in handling parameters, but essential reasoning abilities fail to process complex regulatory texts which exist in fields such as Taxation.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:19:17Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00269",
      "anchor": "3474bf95f0",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "How Language Models Process Out-of-Distribution Inputs: A Two-Pathway Framework",
      "headline": "Study reveals language model OOD detection methods are confounded by sequence length; proposes two-pathway framework using embeddings and hidden-state trajectories.",
      "url": "https://arxiv.org/abs/2605.00269",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00269v1 Announce Type: new Abstract: Recent white-box OOD detection methods for LLMs -- including CED, RAUQ, and WildGuard confidence scores -- appear effective, but we show they are structurally confounded by sequence length (|r| >= 0.61) and collapse to near-chance under length-matched evaluation. Even raw attention entropy (mean H(alpha) across heads and layers), a natural baseline we include for completeness, shows the same confound. The confound stems from attention's Theta(log T) dependence on input length. To identify genuine OOD signals after deconfounding, we propose a two-pathway framework: embeddings capture what text is about (effective for topic shifts), while the processing trajectory -- hidden-state evolution across layers -- captures how the model processes input. The relative power of each pathway varies along a vocabulary-transparency spectrum: embedding methods excel on vocabulary-distinctive OOD, while trajectory features detect covert-intent inputs that share vocabulary with normal text (0.721 avg AUROC; Jailbreak: 0.850). Three evidence lines support this framework: (1) a crossover between k-NN and trajectory scoring across 6 tasks, where each pathway wins on different OOD types; (2) a per-layer analysis showing that layer-0 k-NN signal is almost entirely a length artifact (Jailbreak: 0.759 raw -> 0.389 matched) -- processing constructs genuine OOD signal from near-chance embeddings; and (3) circuit attribution showing adversarial tasks engage attention circuits more than semantic tasks (p = 0.022; Jailbreak patching p < 0.001), with partial cross-model replication. Code release upon publication.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:19:26Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00270",
      "anchor": "722dd92ef2",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Are You the A-hole? A Fair, Multi-Perspective Ethical Reasoning Framework",
      "headline": "A neuro-symbolic framework using MaxSAT resolves conflicting moral judgments by converting natural language explanations into logical constraints, outperforming majority voting.",
      "url": "https://arxiv.org/abs/2605.00270",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00270v1 Announce Type: new Abstract: Standard methods for aggregating natural language judgments, such as majority voting, often fail to produce logically consistent results when applied to high-conflict domains, treating differing opinions as noise. We propose a neuro-symbolic aggregation framework that formalizes conflict resolution through Weighted Maximum Satisfiability (MaxSAT). Our pipeline utilizes a language model to map unstructured natural language explanations into interpretable logical predicates and confidence weights. These components are then encoded as soft constraints within the Z3 solver, transforming the aggregation problem into an optimization task that seeks the maximum consistency across conflicting testimony. Using the Reddit r/AmItheAsshole forum as a case study in large-scale moral disagreement, our system generates logically coherent verdicts that diverge from popularity-based labels 62% of the time, corroborated by an 86% agreement rate with independent human evaluators. This study demonstrates the efficacy of coupling neural semantic extraction with formal solvers to enforce logical soundness and explainability in the aggregation of noisy human reasoning.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:19:35Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00294",
      "anchor": "df957d4335",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "What Don't You Understand? Using Large Language Models to Identify and Characterize Student Misconceptions About Challenging Topics",
      "headline": "LLMs identify student misconceptions in online biomedical courses by analyzing quiz data, response patterns, and lecture transcripts.",
      "url": "https://arxiv.org/abs/2605.00294",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00294v1 Announce Type: new Abstract: This study presents a systematic approach to identifying and characterizing student misconceptions in online learning environments through a novel combination of quantitative performance analysis and large language model (LLM) assessment. We analyzed data from 9 course periods across 5 online biomedical science courses, encompassing 3,802 medical student enrollments. Using data from 40-50 topic-focused quizzes per course, we developed a two-stage methodology. First, we identified challenging central topics using quiz-level performance metrics. Second, we employed LLMs to characterize the underlying misconceptions in these high-priority areas. By examining student performance on first attempts across primarily multiple-choice questions (MCQs), we identified consistently challenging topics that were also central to course objectives. We then leveraged recent advances in generative AI to analyze three distinct data sources in combination: quiz question content, student response patterns, and lecture transcripts. This approach revealed actionable insights about student misconceptions that were not apparent from performance data alone. The quality of the LLM-identified misconceptions was rated as excellent by subject matter experts. We also conducted teacher interviews to assess the perceived utility of our topic identification method. Faculty found that data-driven identification of challenging topics was valuable and corroborated their own classroom observations. This methodology provides a scalable approach to characterizing student difficulties in learning environments where quizzes are used. Our findings demonstrate the potential for targeted and potentially personalized interventions in future course iterations, with clear pathways for measuring intervention effectiveness through follow-up quiz performance.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:19:47Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00318",
      "anchor": "438671d468",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Structure-Aware Chunking for Tabular Data in Retrieval-Augmented Generation",
      "headline": "Researchers propose structure-aware chunking method for tabular data in RAG systems that reduces chunk count by up to 56% and improves retrieval performance.",
      "url": "https://arxiv.org/abs/2605.00318",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00318v1 Announce Type: new Abstract: Tabular documents such as CSV and Excel files are widely used in enterprise data pipelines, yet existing chunking strategies for retrieval-augmented generation (RAG) are primarily designed for unstructured text and do not account for tabular structure. We propose a structure-aware tabular chunking (STC) framework that operates on row-level units by constructing a hierarchical Row Tree representation, where each row is encoded as a key-value block. STC performs token-constrained splitting aligned with structural boundaries and applies overlap-free greedy merging to produce dense, non-overlapping chunks. This design preserves semantic relationships between fields within a row while improving token utilization and reducing fragmentation. Across evaluations on the MAUD dataset, STC reduces chunk count by up to 40% and 56% compared to standard recursive and key-value based baselines, respectively, while improving token utilization and processing efficiency. In retrieval benchmarks, STC improves MRR from 0.3576 to 0.5945 in a hybrid setting and increases Recall@1 from 0.366 to 0.754 in BM25-only retrieval. These results demonstrate that preserving structure during chunking improves retrieval performance, highlighting the importance of structure-aware chunking for RAG over tabular data.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:19:55Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00326",
      "anchor": "7f39c037f3",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Prompt-Induced Score Variance in Zero-Shot Binary Vision-Language Safety Classification",
      "headline": "Vision-language safety classifiers produce inconsistent scores across semantically equivalent prompts; prompt averaging improves reliability and calibration.",
      "url": "https://arxiv.org/abs/2605.00326",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00326v1 Announce Type: new Abstract: Single-prompt first-token probabilities from zero-shot vision-language model (VLM) safety classifiers are treated as decision scores, but we show they are unreliable under semantically equivalent prompt reformulation: even when the binary label is constrained to a fixed output position, equivalent prompts can induce materially different unsafe probabilities for the same sample. Across multimodal safety benchmarks and multiple VLM families, cross-prompt variance is strongly associated with prompt-level disagreement and higher error, making it a useful fragility diagnostic. A training-free mean ensemble improves NLL on all 14 dataset-model evaluation pairs and ECE on 12/14 relative to a train-selected single-prompt baseline, and wins more head-to-head NLL comparisons than labeled temperature scaling, Platt scaling, and isotonic regression applied to the same prompt. Ranking gains are consistent against the train-selected baseline on both AUROC and AUPRC, and against the full 15-prompt distribution remain consistent on AUPRC while softening on AUROC. Labeled calibration on top of the mean provides further gains when labels are available, identifying prompt averaging as a strong label-free first stage rather than a replacement for calibration. We frame this as a reliability stress test for zero-shot VLM first-token safety scores and recommend prompt-family evaluation with mean aggregation as a standard label-free reliability baseline.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:20:05Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00336",
      "anchor": "1e4326a066",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Budget-Aware Routing for Long Clinical Text",
      "headline": "RCD algorithm selects clinical text subsets under token budgets to reduce LLM deployment costs while balancing relevance, coverage, and diversity.",
      "url": "https://arxiv.org/abs/2605.00336",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00336v1 Announce Type: new Abstract: A key challenge for large language models is token cost per query and overall deployment cost. Clinical inputs are long, heterogeneous, and often redundant, while downstream tasks are short and high stakes. We study budgeted context selection, where a subset of document units is chosen under a strict token budget so an off-the-shelf generator can meet fixed cost and latency constraints. We cast this as a knapsack-constrained subset selection problem with two design choices, unitization that defines document segmentation and selection that determines which units are kept. We propose \\textbf{RCD}, a monotone submodular objective that balances relevance, coverage, and diversity. We compare sentence, section, window, and cluster-based unitization, and introduce a routing heuristic that adapts to the budget regime. Experiments on MIMIC discharge notes, Cochrane abstracts, and L-Eval show that optimal strategies depend on the evaluation setting. Positional heuristics perform best at low budgets in extractive tasks, while diversity-aware methods such as MMR improve LLM generation. Selector choice matters more than unitization, with cluster-based grouping reducing performance and other schemes behaving similarly. ROUGE saturates for LLM summaries, while BERTScore better reflects quality differences. We release our code at https://github.com/stone-technologies/ACL_budget_paper.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:34:04Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00342",
      "anchor": "87ec406495",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Making Every Verified Token Count: Adaptive Verification for MoE Speculative Decoding",
      "headline": "EVICT adaptively truncates draft trees in MoE speculative decoding to reduce verification costs, achieving 2.35x speedup over standard autoregressive generation.",
      "url": "https://arxiv.org/abs/2605.00342",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00342v1 Announce Type: new Abstract: Tree-based speculative decoding accelerates autoregressive generation by verifying multiple draft candidates in parallel, but this advantage weakens for sparse Mixture-of-Experts (MoE) models. As the draft tree grows, different branches activate different experts, expanding the union of activated experts and substantially increasing target-side verification cost. We propose EVICT, a training-free, hyperparameter-free, and lossless adaptive verification method for MoE speculative decoding. EVICT makes every verified token count by truncating the draft tree before target verification and retaining only the cost-effective prefix. It leverages fine-grained drafter signals to estimate candidate benefit, combines them with offline-profiled verification cost, and remains highly compatible with the high-performance graph-based serving framework SGLang. Extensive experiments on diverse MoE backbones and benchmarks show that EVICT achieves up to 2.35x speedup over autoregressive decoding and an average 1.21x speedup over the state-of-the-art baseline EAGLE-3, while significantly reducing unnecessary expert activations during verification.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:34:18Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00356",
      "anchor": "128ad8926b",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "MemRouter: Memory-as-Embedding Routing for Long-Term Conversational Agents",
      "headline": "MemRouter uses embedding-based routing instead of autoregressive generation to improve memory management in conversational agents, achieving higher F1 scores and lower latency.",
      "url": "https://arxiv.org/abs/2605.00356",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00356v1 Announce Type: new Abstract: Long-term conversational agents must decide which turns to store in external memory, yet recent systems rely on autoregressive LLM generation at every turn to make that decision. We present MemRouter, a write-side memory router that decouples memory admission from the downstream answer backbone and replaces per-turn memory-management decoding with an embedding-based routing policy. MemRouter encodes each turn together with recent context, projects the resulting embeddings through a frozen LLM backbone, and predicts whether the turn should be stored using lightweight classification heads while training only 12M parameters. Under a controlled matched-harness comparison on LoCoMo, where the retrieval pipeline, answer prompts, and QA backbone (Qwen2.5-7B) are held identical, MemRouter outperforms an LLM-based memory manager on every question category (overall F1 52.0 vs 45.6, non-overlapping 95% CIs) while reducing memory-management p50 latency from 970ms to 58ms. Descriptive factorial averaging further shows that learned admission improves mean F1 by +10.3 over random storage, category-specific prompting adds +5.2 over a generic prompt, and retrieval contributes +0.7. These results suggest that write-side memory admission can be learned by a small supervised router, while answer generation remains a separate downstream component in long-horizon conversational QA.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:34:26Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00358",
      "anchor": "98cdb0f4ef",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "From Backward Spreading to Forward Replay: Revisiting Target Construction in LLM Parameter Editing",
      "headline": "Study proposes forward-propagation to replace backward spreading in LLM parameter editing, achieving more accurate targets with same computational cost.",
      "url": "https://arxiv.org/abs/2605.00358",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00358v1 Announce Type: new Abstract: LLM parameter editing methods commonly rely on computing an ideal target hidden-state at a target layer (referred as anchor point) and distributing the target vector to multiple preceding layers (commonly known as backward spreading) for cooperative editing. Although widely used for a long time, its underlying basis have not been systematically investigated. In this paper, we first conduct a systematic study of its foundations, which helps clarify its capability boundaries, practical considerations, and potential failure modes. Then, we propose a simple and elegant alternative that replaces backward spreading with forward-propagation. Instead of optimizing the target at the last editing layer, we optimize the anchor point at the first editing layer, and then propagate it forward to obtain accurate and mutually compatible target hidden-states for all subsequent editing layers. This approach achieves the same computational complexity as existing methods while producing more accurate layer-wise targets. Our method is simple, without interfering with either the computation of the initial target hidden state or any other components of the subsequent editing pipeline, and thus constituting a benefit for a wide range of LLM parameter editing methods.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:34:39Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00364",
      "anchor": "da1609ceb6",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Unlearning What Matters: Token-Level Attribution for Precise Language Model Unlearning",
      "headline": "TokenUnlearn selectively targets critical tokens for machine unlearning in language models, outperforming sequence-level approaches in forgetting and utility preservation.",
      "url": "https://arxiv.org/abs/2605.00364",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00364v1 Announce Type: new Abstract: Machine unlearning has emerged as a critical capability for addressing privacy, safety, and regulatory concerns in large language models (LLMs). Existing methods operate at the sequence level, applying uniform updates across all tokens despite only a subset encoding the knowledge targeted for removal. This introduces gradient noise, degrades utility, and leads to suboptimal forgetting. We propose TokenUnlearn, a token-level attribution framework that identifies and selectively targets critical tokens. Our approach combines knowledge-aware signals via masking, and entropy-aware signals to yield importance scores for precise token selection. We develop two complementary strategies: hard selection, applying unlearning only to high-importance tokens, and soft weighting, modulating gradient contributions based on importance scores. Both extend existing methods to token-level variants. Theoretical analysis shows token-level selection improves gradient signal-to-noise ratio. Experiments on TOFU and WMDP benchmarks across three model architectures demonstrate consistent improvements over sequence-level baselines in both forgetting effectiveness and utility preservation.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:34:49Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00373",
      "anchor": "e53a881c35",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Language-free Experience at Expo 2025 Osaka",
      "headline": "Researchers developed multilingual translation and simultaneous interpretation technologies for Expo 2025 Osaka, emphasizing low-latency, high-quality translation using chunk-based segmentation and multi-engine approaches.",
      "url": "https://arxiv.org/abs/2605.00373",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00373v1 Announce Type: new Abstract: In line with the Global Communication Plan 2025, we have pursued the development of multilingual translation technologies to realize a language-barrier-free experience at Expo 2025 Osaka. Our work includes the advancement of simultaneous interpretation systems emphasizing high translation quality and low latency. Key achievements include chunk-based input segmentation, context-aware translation, and multi-engine machine translation technologies. Through demonstration deployments and collaboration with private companies, our technologies have led to real-world applications, with several services and systems showcased at Expo 2025 Osaka.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:34:57Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00383",
      "anchor": "112c438dae",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Agentic AI for Substance Use Education: Integrating Regulatory and Scientific Knowledge Sources",
      "headline": "An agentic AI system integrating DEA records and peer-reviewed literature for substance use education received high ratings from subject matter experts.",
      "url": "https://arxiv.org/abs/2605.00383",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00383v1 Announce Type: new Abstract: The delivery of traditional substance education has remained problematic due to challenges in scalability, personalization, and the currency of information in a rapidly evolving substance use landscape. While artificial intelligence (AI) offers a promising frontier for enhancing educational delivery, its application in providing real-time, authoritative substance use education remains largely underexplored. We built an agentic-based AI web application that combined Drug Enforcement Administration records with peer-reviewed literature in real-time to provide transparent context-sensitive substance use education. The system uses retrieval-augmented generation with a carefully filtered corpus of 102 documents and dynamic PubMed queries. Document storage was semantically chunked and placed in a vector representation in order to be easily retrieved. We conducted an expert evaluation study in which a panel of five subject matter experts generated 30 domain-specific questions, and two independent raters assessed 90 system interactions (30 primary questions plus two contextual follow-ups each) using a five-point Likert scale across four criteria: factual accuracy, citation quality, contextual coherence, and regulatory appropriateness. Mean ratings ranged from 4.18 to 4.35 across the four criteria (overall category range: 4.05-4.52), with substantial inter-rater agreement (Cohen's kappa = 0.78). These findings suggest that agentic AI architectures integrating authoritative regulatory sources with real-time scientific literature represent a promising direction for scalable, accurate, and verifiable health education delivery, warranting further evaluation through longitudinal user studies.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:35:08Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00410",
      "anchor": "7f9d67dc8f",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Agent Capsules: Quality-Gated Granularity Control for Multi-Agent LLM Pipelines",
      "headline": "Agent Capsules optimizes multi-agent LLM pipelines by adaptively selecting execution strategies within quality constraints, achieving 51% token savings.",
      "url": "https://arxiv.org/abs/2605.00410",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00410v1 Announce Type: new Abstract: A multi-agent pipeline with N agents typically issues N LLM calls per run. Merging agents into fewer calls (compound execution) promises token savings, but naively merged calls silently degrade quality through tool loss and prompt compression. We present Agent Capsules, an adaptive execution runtime that treats multi-agent pipeline execution as an optimization problem with empirical quality constraints. The runtime instruments coordination overhead per group, scores composition opportunity, selects among three compound execution strategies, and gates every mode switch on rolling-mean output quality. A controlled negative result confirms that injecting more context into a merged call worsens compression rather than relieving it, so the framework's escalation ladder (standard, then two-phase, then sequential) recovers quality by moving toward per-agent dispatch rather than by rewriting merged prompts. On LLM-judged quality, the controller matches a hand-tuned oracle on every measured (model, group, mode) cell: routing compound whenever the oracle would, and reverting to fine whenever quality would fail the floor, without per-model configuration. Against a hand-crafted LangGraph implementation of a 14-agent competitive intelligence pipeline, Agent Capsules uses 51% fewer fine-mode input tokens and 42% fewer compound-mode input tokens, at +0.020 and +0.017 quality respectively. Against a DSPy implementation of a 5-agent due diligence pipeline, the framework uses 19% fewer tokens than uncompiled DSPy at quality parity, and 68% fewer tokens than MIPROv2 at +0.052 quality. Even before compound mode fires, the runtime delivers efficiency through automatic policy resolution, cache-aligned prompts, and topology-aware context injection, matching both hand-tuned and compile-time baselines without training data or per-pipeline engineering.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:35:26Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00421",
      "anchor": "0eebba6ffd",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "RadLite: Multi-Task LoRA Fine-Tuning of Small Language Models for CPU-Deployable Radiology AI",
      "headline": "Small language models fine-tuned with LoRA achieve strong radiology performance and can be deployed on consumer CPUs.",
      "url": "https://arxiv.org/abs/2605.00421",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00421v1 Announce Type: new Abstract: Large language models (LLMs) show promise in radiology but their deployment is limited by computational requirements that preclude use in resource-constrained clinical environments. We investigate whether small language models (SLMs) of 3-4 billion parameters can achieve strong multi-task radiology performance through LoRA fine-tuning, enabling deployment on consumer-grade CPUs. We train Qwen2.5-3B-Instruct and Qwen3-4B on 162K samples spanning 9 radiology tasks - RADS classification across 10 systems, impression generation, temporal comparison, radiology NLI, NER, abnormality detection, N/M staging, and radiology Q&A - compiled from 12 public datasets. Both models are evaluated on up to 500 held-out test samples per task with standardized metrics. Our key findings are: (1) LoRA fine-tuning dramatically improves performance over zero-shot baselines (RADS accuracy +53%, NLI +60%, N-staging +89%); (2) the two models exhibit complementary strengths - Qwen2.5 excels at structured generation tasks while Qwen3 dominates extractive tasks; (3) a task-outed oracle ensemble combining both models achieves the best performance across all tasks; (4) few-shot prompting with fine-tuned models hurts performance, demonstrating that LoRA adaptation is more effective than in-context learning for specialized domains; and (5) models can be quantized to GGUF format (~1.8-2.4GB) for CPU deployment at 4-8 tokens/second on consumer hardware. Our work demonstrates that small, efficiently fine-tuned models - which we collectively call RadLite - can serve as practical multi-task radiology AI assistants deployable entirely on consumer hardware without GPU requirements.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:49:06Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00435",
      "anchor": "f951cfb660",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Escaping Mode Collapse in LLM Generation via Geometric Regulation",
      "headline": "Researchers propose Reinforced Mode Regulation, a low-rank intervention in Transformer value caches that reduces mode collapse in language model generation.",
      "url": "https://arxiv.org/abs/2605.00435",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00435v1 Announce Type: new Abstract: Mode collapse is a persistent challenge in generative modeling and appears in autoregressive text generation as behaviors ranging from explicit looping to gradual loss of diversity and premature trajectory convergence. We take a dynamical-systems view and reinterpret mode collapse as reduced state-space accessibility caused by *geometric collapse*: during generation, the model's internal trajectory becomes confined to a low-dimensional region of its representation space. This implies mode collapse is not purely a token-level phenomenon and cannot be reliably solved by symbolic constraints or probability-only decoding heuristics. Guided by this perspective, we propose *Reinforced Mode Regulation* (RMR), a lightweight, online state-space intervention that regulates dominant self-reinforcing directions in the Transformer value cache (implemented as low-rank damping). Across multiple large language models, RMR substantially reduces mode collapse and enables stable, high-quality generation at extremely low entropy rates (down to 0.8 nats/step), whereas standard decoding typically collapses near 2.0 nats/step.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:49:16Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00436",
      "anchor": "1c01a4bada",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "Impact of Task Phrasing on Presumptions in Large Language Models",
      "headline": "Research reveals LLMs form presumptions from task phrasing, hindering adaptability; neutral phrasing enables better logical reasoning.",
      "url": "https://arxiv.org/abs/2605.00436",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00436v1 Announce Type: new Abstract: Concerns with the safety and reliability of applying large-language models (LLMs) in unpredictable real-world applications motivate this study, which examines how task phrasing can lead to presumptions in LLMs, making it difficult for them to adapt when the task deviates from these assumptions. We investigated the impact of these presumptions on the performance of LLMs using the iterated prisoner's dilemma as a case study. Our experiments reveal that LLMs are susceptible to presumptions when making decisions even with reasoning steps. However, when the task phrasing was neutral, the models demonstrated logical reasoning without much presumptions. These findings highlight the importance of proper task phrasing to reduce the risk of presumptions in LLMs.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:49:25Z"
    },
    {
      "id": "arxiv-cs.CL|https://arxiv.org/abs/2605.00468",
      "anchor": "09bb5bee59",
      "source_id": "arxiv-cs.CL",
      "source_name": "arXiv cs.CL",
      "category": "paper",
      "category_emoji": "📄",
      "hashtags": [
        "paper",
        "arxiv"
      ],
      "title": "ReLay: Personalized LLM-Generated Plain-Language Summaries for Better Understanding, but at What Cost?",
      "headline": "LLM-personalized plain language summaries improve comprehension but risk reinforcing biases and introducing hallucinations.",
      "url": "https://arxiv.org/abs/2605.00468",
      "published_at": "2026-05-04T04:00:00Z",
      "discovered_at": "2026-05-04T04:03:05Z",
      "description": "arXiv:2605.00468v1 Announce Type: new Abstract: Plain Language Summaries (PLS) aim to make research accessible to lay readers, but they are typically written in a one-size-fits-all style that ignores differences in readers' information needs and comprehension. In health contexts, this limitation is particularly important because misunderstanding scientific information can affect real-world decisions. Large language models (LLMs) offer new opportunities for personalizing PLS, but it remains unclear whether personalization helps, which strategies are most effective, and how to balance personalization with safety. We introduce ReLay, a dataset of 300 participant--PLS pairs from 50 lay participants in both static (expert-written) and interactive (LLM-personalized) settings. ReLay includes user characteristics, health information needs, information-seeking behavior, comprehension outcomes, interaction logs, and quality ratings. We use ReLay to evaluate five LLMs across two personalization methods. Personalization improves comprehension and perceived quality, but it also raises the risk of reinforcing user biases and introducing hallucinations, revealing a trade-off between personalization and safety. These findings highlight the need for personalization methods that are both effective and trustworthy for diverse lay audiences.",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T05:49:36Z"
    },
    {
      "id": "hn|https://letsbuyspiritair.com/",
      "anchor": "9ee8ef25ee",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Let's Buy Spirit Air",
      "headline": "Let's Buy Spirit Air",
      "url": "https://letsbuyspiritair.com/",
      "published_at": "2026-05-03T23:36:11Z",
      "discovered_at": "2026-05-04T02:02:16Z",
      "description": "HN score 119 · 79 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://larsfaye.com/articles/agentic-coding-is-a-trap",
      "anchor": "eef3b1cc5c",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Agentic Coding Is a Trap",
      "headline": "Agentic Coding Is a Trap",
      "url": "https://larsfaye.com/articles/agentic-coding-is-a-trap",
      "published_at": "2026-05-03T22:52:07Z",
      "discovered_at": "2026-05-04T01:32:03Z",
      "description": "HN score 109 · 77 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://github.com/aattaran/deepclaude",
      "anchor": "96b2a7ee4c",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "DeepClaude – Claude Code agent loop with DeepSeek V4 Pro, 17x cheaper",
      "headline": "DeepClaude – Claude Code agent loop with DeepSeek V4 Pro, 17x cheaper",
      "url": "https://github.com/aattaran/deepclaude",
      "published_at": "2026-05-03T22:13:20Z",
      "discovered_at": "2026-05-04T00:16:20Z",
      "description": "HN score 110 · 46 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://www.lelanthran.com/chap15/content.html",
      "anchor": "a40ca5f84f",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "LLMs Are Not a Higher Level of Abstraction",
      "headline": "LLMs Are Not a Higher Level of Abstraction",
      "url": "https://www.lelanthran.com/chap15/content.html",
      "published_at": "2026-05-03T17:51:58Z",
      "discovered_at": "2026-05-04T05:48:57Z",
      "description": "HN score 102 · 98 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "simonw|https://simonwillison.net/2026/May/3/anthropic",
      "anchor": "6ba32ec045",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Quoting Anthropic",
      "headline": "Quoting Anthropic",
      "url": "https://simonwillison.net/2026/May/3/anthropic",
      "published_at": "2026-05-03T15:13:23Z",
      "discovered_at": "2026-05-03T15:27:18Z",
      "description": "<blockquote cite=\"https://www.anthropic.com/research/claude-personal-guidance\"><p>We used an automatic classifier which judged sycophancy by looking at whether Claude showed a willingness to push back, maintain positions when challenged, give praise proportional to the merit of ideas, and speak frankly regardless of what a person wants to hear. Most of the time in these situations, Claude expressed no sycophancy—only 9% of conversations included sycophantic behavior (Figure 2). But two domains were exceptions: we saw sycophantic behavior in 38% of conversations focused on spirituality, and 25% of conversations on relationships.</p></blockquote> <p class=\"cite\">&mdash; <a href=\"https://www.anthropic.com/research/claude-personal-guidance\">Anthropic</a>, How people ask Claude for personal guidance</p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/ai-personality\">ai-personality</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "github|https://github.com/Hmbown/DeepSeek-TUI",
      "anchor": "c478540395",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "Hmbown/DeepSeek-TUI — Coding agent for DeepSeek models that runs in your terminal",
      "headline": "DeepSeek-TUI is a terminal-based coding agent for DeepSeek models.",
      "url": "https://github.com/Hmbown/DeepSeek-TUI",
      "published_at": "2026-05-03T10:39:58Z",
      "discovered_at": "2026-05-03T10:39:58Z",
      "description": "Coding agent for DeepSeek models that runs in your terminal",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-03T10:40:13Z"
    },
    {
      "id": "github|https://github.com/AIDC-AI/Pixelle-Video",
      "anchor": "e70cb5768f",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "AIDC-AI/Pixelle-Video — 🚀 AI 全自动短视频引擎 | AI Fully Automated Short Video Engine",
      "headline": "Pixelle-Video is an AI system for automated short video generation.",
      "url": "https://github.com/AIDC-AI/Pixelle-Video",
      "published_at": "2026-05-03T10:39:58Z",
      "discovered_at": "2026-05-03T10:39:58Z",
      "description": "🚀 AI 全自动短视频引擎 | AI Fully Automated Short Video Engine",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-03T10:40:22Z"
    },
    {
      "id": "github|https://github.com/sponsors/czlonkowski",
      "anchor": "8800b69301",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "sponsors/czlonkowski — A MCP for Claude Desktop / Claude Code / Windsurf / Cursor to build n8n workflows for you",
      "headline": "MCP tool enables Claude Desktop, Claude Code, Windsurf, and Cursor to build n8n workflows.",
      "url": "https://github.com/sponsors/czlonkowski",
      "published_at": "2026-05-03T10:39:58Z",
      "discovered_at": "2026-05-03T10:39:58Z",
      "description": "A MCP for Claude Desktop / Claude Code / Windsurf / Cursor to build n8n workflows for you",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-03T10:40:29Z"
    },
    {
      "id": "hn|https://acai.sh/blog/specsmaxxing",
      "anchor": "5ae940d564",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Specsmaxxing – On overcoming AI psychosis, and why I write specs in YAML",
      "headline": "Specsmaxxing – On overcoming AI psychosis, and why I write specs in YAML",
      "url": "https://acai.sh/blog/specsmaxxing",
      "published_at": "2026-05-03T06:33:38Z",
      "discovered_at": "2026-05-03T09:24:19Z",
      "description": "HN score 101 · 81 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://thinkpol.ca/2026/04/30/an-open-weights-chinese-model-just-beat-claude-gpt-5-5-and-gemini-in-a-programming-challenge",
      "anchor": "c588b5d57d",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge",
      "headline": "Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge",
      "url": "https://thinkpol.ca/2026/04/30/an-open-weights-chinese-model-just-beat-claude-gpt-5-5-and-gemini-in-a-programming-challenge/",
      "published_at": "2026-05-03T04:05:28Z",
      "discovered_at": "2026-05-03T05:37:43Z",
      "description": "HN score 103 · 41 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://www.nytimes.com/2026/05/01/business/surveillance-pricing-groceries-maryland.html",
      "anchor": "cff7cad468",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Maryland Is First to Ban A.I.-Driven Price Increases in Grocery Stores",
      "headline": "Maryland Is First to Ban A.I.-Driven Price Increases in Grocery Stores",
      "url": "https://www.nytimes.com/2026/05/01/business/surveillance-pricing-groceries-maryland.html",
      "published_at": "2026-05-03T01:24:09Z",
      "discovered_at": "2026-05-03T05:52:49Z",
      "description": "HN score 106 · 51 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "huggingface|https://huggingface.co/SulphurAI/Sulphur-2-base",
      "anchor": "9fadfd5a68",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "SulphurAI/Sulphur-2-base · 114♥ · 332 dl",
      "headline": "SulphurAI released Sulphur-2-base, a text-to-video model on HuggingFace.",
      "url": "https://huggingface.co/SulphurAI/Sulphur-2-base",
      "published_at": "2026-05-03T00:33:24Z",
      "discovered_at": "2026-05-04T01:47:04Z",
      "description": "text-to-video",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-04T01:47:19Z"
    },
    {
      "id": "hn|https://www.theguardian.com/technology/2026/apr/30/ai-outperforms-doctors-in-harvard-trial-of-emergency-triage-diagnoses",
      "anchor": "38fb510b12",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "OpenAI's o1 correctly diagnosed 67% of ER patients vs. 50-55% by triage doctors",
      "headline": "OpenAI's o1 correctly diagnosed 67% of ER patients vs. 50-55% by triage doctors",
      "url": "https://www.theguardian.com/technology/2026/apr/30/ai-outperforms-doctors-in-harvard-trial-of-emergency-triage-diagnoses",
      "published_at": "2026-05-03T00:30:50Z",
      "discovered_at": "2026-05-03T20:14:33Z",
      "description": "HN score 102 · 32 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://www.mendral.com/blog/agent-harness-belongs-outside-sandbox",
      "anchor": "8f52e21d80",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "The agent harness belongs outside the sandbox",
      "headline": "The agent harness belongs outside the sandbox",
      "url": "https://www.mendral.com/blog/agent-harness-belongs-outside-sandbox",
      "published_at": "2026-05-02T21:21:27Z",
      "discovered_at": "2026-05-03T08:39:05Z",
      "description": "HN score 101 · 75 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://github.com/microsoft/vscode/pull/310226",
      "anchor": "f174637e31",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "VS Code inserting 'Co-Authored-by Copilot' into commits regardless of usage",
      "headline": "VS Code inserting 'Co-Authored-by Copilot' into commits regardless of usage",
      "url": "https://github.com/microsoft/vscode/pull/310226",
      "published_at": "2026-05-02T19:57:26Z",
      "discovered_at": "2026-05-02T21:18:53Z",
      "description": "HN score 104 · 50 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://arxiv.org/abs/2509.00462",
      "anchor": "d5e7dc81a3",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "LLMs consistently pick resumes they generate over ones by humans or other models",
      "headline": "LLMs consistently pick resumes they generate over ones by humans or other models",
      "url": "https://arxiv.org/abs/2509.00462",
      "published_at": "2026-05-02T15:28:13Z",
      "discovered_at": "2026-05-02T16:35:49Z",
      "description": "HN score 121 · 64 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/introducing-muse-spark-msl",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Introducing Muse Spark: Scaling Towards Personal Superintelligence",
      "headline": "Introducing Muse Spark: Scaling Towards Personal Superintelligence",
      "url": "https://ai.meta.com/blog/introducing-muse-spark-msl",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "076887d2c1"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/scaling-how-we-build-test-advanced-ai",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Scaling How We Build and Test Our Most Advanced AI",
      "headline": "Scaling How We Build and Test Our Most Advanced AI",
      "url": "https://ai.meta.com/blog/scaling-how-we-build-test-advanced-ai",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6b1e0e3218"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/alta-daily-fashion-app-segment-anything",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "How Alta Daily Uses Meta’s Segment Anything to Reimagine the Digital Closet",
      "headline": "How Alta Daily Uses Meta’s Segment Anything to Reimagine the Digital Closet",
      "url": "https://ai.meta.com/blog/alta-daily-fashion-app-segment-anything",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a0e7a68bf0"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/segment-anything-model-3",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "SAM 3.1: Faster and More Accessible Real-Time Video Detection and Tracking With Multiplexing and Global Reasoning",
      "headline": "SAM 3.1: Faster and More Accessible Real-Time Video Detection and Tracking With Multiplexing and Global Reasoning",
      "url": "https://ai.meta.com/blog/segment-anything-model-3",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a3e940c842"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/meta-mtia-scale-ai-chips-for-billions",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Four MTIA Chips in Two Years: Scaling AI Experiences for Billions",
      "headline": "Four MTIA Chips in Two Years: Scaling AI Experiences for Billions",
      "url": "https://ai.meta.com/blog/meta-mtia-scale-ai-chips-for-billions",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9d357beef0"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/tribe-v2-brain-predictive-foundation-model",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Introducing TRIBE v2: A Predictive Foundation Model Trained to Understand How the Human Brain Processes Complex Stimuli",
      "headline": "Introducing TRIBE v2: A Predictive Foundation Model Trained to Understand How the Human Brain Processes Complex Stimuli",
      "url": "https://ai.meta.com/blog/tribe-v2-brain-predictive-foundation-model",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3d312790c8"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/world-resources-institute-dino-canopy-height-maps-v2",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Mapping the World's Forests with Greater Precision: Introducing Canopy Height Maps v2",
      "headline": "Mapping the World's Forests with Greater Precision: Introducing Canopy Height Maps v2",
      "url": "https://ai.meta.com/blog/world-resources-institute-dino-canopy-height-maps-v2",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "72b447054a"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/upenn-dino-sam-helping-medical-triage",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "How DINO and SAM are Helping Modernize Essential Medical Triage Practices",
      "headline": "How DINO and SAM are Helping Modernize Essential Medical Triage Practices",
      "url": "https://ai.meta.com/blog/upenn-dino-sam-helping-medical-triage",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6e45d1ca92"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/forest-research-dino",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Reducing Government Costs and Increasing Access to Greenspaces in the United Kingdom with DINO",
      "headline": "Reducing Government Costs and Increasing Access to Greenspaces in the United Kingdom with DINO",
      "url": "https://ai.meta.com/blog/forest-research-dino",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "c639d546e6"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/orakl-oncology-dinov2-accelerating-cancer-treatment",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "How Orakl Oncology is using DINOv2 to accelerate cancer treatment discovery",
      "headline": "How Orakl Oncology is using DINOv2 to accelerate cancer treatment discovery",
      "url": "https://ai.meta.com/blog/orakl-oncology-dinov2-accelerating-cancer-treatment",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d1838235b7"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/usra-sam-flood-emergencies",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "The Universities Space Research Association Applies Segment Anything Model for Responding to Flood Emergencies",
      "headline": "The Universities Space Research Association Applies Segment Anything Model for Responding to Flood Emergencies",
      "url": "https://ai.meta.com/blog/usra-sam-flood-emergencies",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "97b8f94871"
    },
    {
      "id": "meta-ai|https://ai.meta.com/blog/sam-audio",
      "source_id": "meta-ai",
      "source_name": "Meta AI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "meta"
      ],
      "title": "Introducing SAM Audio: The First Unified Multimodal Model for Audio Separation",
      "headline": "Introducing SAM Audio: The First Unified Multimodal Model for Audio Separation",
      "url": "https://ai.meta.com/blog/sam-audio",
      "published_at": "2026-05-02T13:40:35Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a0438fdbe7"
    },
    {
      "id": "github|https://github.com/TauricResearch/TradingAgents",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "TauricResearch/TradingAgents — TradingAgents: Multi-Agents LLM Financial Trading Framework",
      "headline": "TradingAgents is a multi-agent LLM framework for financial trading applications.",
      "url": "https://github.com/TauricResearch/TradingAgents",
      "published_at": "2026-05-02T13:26:02Z",
      "description": "TradingAgents: Multi-Agents LLM Financial Trading Framework",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:26:14Z",
      "anchor": "8a6b842dda"
    },
    {
      "id": "github|https://github.com/ruvnet/ruflo",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "ruvnet/ruflo — 🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, distributed swarm intelligence, RAG integration, and native Claude Code / Codex Integration",
      "headline": "Ruflo provides agent orchestration for Claude, featuring multi-agent swarms, autonomous workflows, conversational AI, and RAG integration.",
      "url": "https://github.com/ruvnet/ruflo",
      "published_at": "2026-05-02T13:26:02Z",
      "description": "🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, distributed swarm intelligence, RAG integration, and native Claude Code / Codex Integration",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:26:24Z",
      "anchor": "3b68fbd3bd"
    },
    {
      "id": "github|https://github.com/browserbase/skills",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "browserbase/skills — Claude Agent SDK with a web browsing tool",
      "headline": "Browserbase/skills is a Claude Agent SDK with web browsing capabilities.",
      "url": "https://github.com/browserbase/skills",
      "published_at": "2026-05-02T13:26:02Z",
      "description": "Claude Agent SDK with a web browsing tool",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:26:43Z",
      "anchor": "f04988da9f"
    },
    {
      "id": "github|https://github.com/1jehuang/jcode",
      "source_id": "github",
      "source_name": "GitHub Trending",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "github"
      ],
      "title": "1jehuang/jcode — Coding Agent Harness",
      "headline": "jcode, a coding agent harness, is trending on GitHub.",
      "url": "https://github.com/1jehuang/jcode",
      "published_at": "2026-05-02T13:26:02Z",
      "description": "Coding Agent Harness",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:26:56Z",
      "anchor": "aefa5f1c89"
    },
    {
      "id": "hn|https://arxiv.org/abs/2406.11717",
      "anchor": "894eb29c2d",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Refusal in Language Models Is Mediated by a Single Direction",
      "headline": "Refusal in Language Models Is Mediated by a Single Direction",
      "url": "https://arxiv.org/abs/2406.11717",
      "published_at": "2026-05-02T13:15:23Z",
      "discovered_at": "2026-05-03T04:22:08Z",
      "description": "HN score 100 · 36 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://github.com/nexu-io/open-design",
      "anchor": "46f8ac71ee",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Open Design: Use Your Coding Agent as a Design Engine",
      "headline": "Open Design: Use Your Coding Agent as a Design Engine",
      "url": "https://github.com/nexu-io/open-design",
      "published_at": "2026-05-02T12:16:16Z",
      "description": "HN score 101 · 64 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-ai-engineer-worlds-fair-autoresearch",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] AI Engineer World's Fair — Autoresearch, Memory, World Models, Tokenmaxxing, Agentic Commerce, and Vertical AI Call for Speakers",
      "headline": "[AINews] AI Engineer World's Fair — Autoresearch, Memory, World Models, Tokenmaxxing, Agentic Commerce, and Vertical AI Call for Speakers",
      "url": "https://www.latent.space/p/ainews-ai-engineer-worlds-fair-autoresearch",
      "published_at": "2026-05-02T07:21:55Z",
      "description": "a quiet day lets us make a call for speakers!",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "90f39a5c01"
    },
    {
      "id": "hn|https://californiawaterblog.com/2026/04/26/ai-water-use-distractions-and-lessons-for-california",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "AI uses less water than the public thinks",
      "headline": "AI uses less water than the public thinks",
      "url": "https://californiawaterblog.com/2026/04/26/ai-water-use-distractions-and-lessons-for-california/",
      "published_at": "2026-05-01T17:18:58Z",
      "description": "HN score 379 · 356 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0a0d4c4c24"
    },
    {
      "id": "hn|https://github.com/Exocija/ZetaLib/blob/main/The%20Gay%20Jailbreak/The%20Gay%20Jailbreak.md",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "The gay jailbreak technique (2025)",
      "headline": "The gay jailbreak technique (2025)",
      "url": "https://github.com/Exocija/ZetaLib/blob/main/The%20Gay%20Jailbreak/The%20Gay%20Jailbreak.md",
      "published_at": "2026-05-01T16:59:35Z",
      "description": "HN score 544 · 226 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8681ab0f28"
    },
    {
      "id": "hn|https://simonwillison.net/2026/Apr/24/deepseek-v4",
      "anchor": "f30b1863a4",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "DeepSeek V4–almost on the frontier, a fraction of the price",
      "headline": "DeepSeek V4–almost on the frontier, a fraction of the price",
      "url": "https://simonwillison.net/2026/Apr/24/deepseek-v4/",
      "published_at": "2026-05-01T16:52:43Z",
      "discovered_at": "2026-05-02T18:17:24Z",
      "description": "HN score 384 · 247 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://www.bbc.com/news/articles/c5yerr4m1yno",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Spotify adds 'Verified' badges to distinguish human artists from AI",
      "headline": "Spotify adds 'Verified' badges to distinguish human artists from AI",
      "url": "https://www.bbc.com/news/articles/c5yerr4m1yno",
      "published_at": "2026-05-01T16:42:41Z",
      "description": "HN score 248 · 272 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "32e2d6239a"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/claude-for-creative-work",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Claude for Creative Work",
      "headline": "Claude for Creative Work",
      "url": "https://www.anthropic.com/news/claude-for-creative-work",
      "published_at": "2026-05-01T15:44:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2f8aa47e13"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/emotion-concepts-function",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Emotion concepts and their function in a large language model",
      "headline": "Emotion concepts and their function in a large language model",
      "url": "https://www.anthropic.com/research/emotion-concepts-function",
      "published_at": "2026-05-01T13:51:54Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "330be8002a"
    },
    {
      "id": "hn|https://github.com/intel/auto-round",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Advanced Quantization Algorithm for LLMs",
      "headline": "Advanced Quantization Algorithm for LLMs",
      "url": "https://github.com/intel/auto-round",
      "published_at": "2026-05-01T09:10:02Z",
      "description": "HN score 128 · 16 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "11d8e2b26c"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-agents-for-everything-else",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work",
      "headline": "[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work",
      "url": "https://www.latent.space/p/ainews-agents-for-everything-else",
      "published_at": "2026-05-01T04:53:41Z",
      "description": "a quiet day lets us reflect on coding agents \"breaking containment\"",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "04ea097051"
    },
    {
      "id": "hn|https://www.washingtonpost.com/health/2026/04/30/adhd-subtype-extreme-brain-scans",
      "anchor": "91dc300522",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Brain scans reveal 3 ADHD subtypes",
      "headline": "Brain scans reveal 3 ADHD subtypes",
      "url": "https://www.washingtonpost.com/health/2026/04/30/adhd-subtype-extreme-brain-scans/",
      "published_at": "2026-05-01T02:03:57Z",
      "discovered_at": "2026-05-04T05:03:37Z",
      "description": "HN score 100 · 92 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/claude-personal-guidance",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "How people ask Claude for personal guidance",
      "headline": "How people ask Claude for personal guidance",
      "url": "https://www.anthropic.com/research/claude-personal-guidance",
      "published_at": "2026-05-01T00:08:28Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "98d14057c4"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/30/codex-goals",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Codex CLI 0.128.0 adds /goal",
      "headline": "Codex CLI 0.128.0 adds /goal",
      "url": "https://simonwillison.net/2026/Apr/30/codex-goals",
      "published_at": "2026-04-30T23:23:17Z",
      "description": "<p><strong><a href=\"https://github.com/openai/codex/releases/tag/rust-v0.128.0\">Codex CLI 0.128.0 adds /goal</a></strong></p> The latest version of OpenAI's Codex CLI coding agent adds their own version of the <a href=\"https://ghuntley.com/ralph/\">Ralph loop</a>: you can now set a <code>/goal</code> and Codex will keep on looping until it evaluates that the goal has been completed... or the configured token budget has been exhausted.</p> <p>It looks like the feature is mainly implemented though the <a href=\"https://github.com/openai/codex/blob/6014b6679ffbd92eeddffa3ad7b4402be6a7fefe/codex-rs/core/templates/goals/continuation.md\">goals/continuation.md</a> and <a href=\"https://github.com/openai/codex/blob/6014b6679ffbd92eeddffa3ad7b4402be6a7fefe/codex-rs/core/templates/goals/budget_limit.md\">goals/budget_limit.md</a> prompts, which are automatically injected at the end of a turn. <p><small></small>Via <a href=\"https://twitter.com/fcoury/status/2049917871799636201\">@fcoury</a></small></p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/prompt-engineering\">prompt-engineering</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/coding-agents\">coding-agents</a>, <a href=\"https://simonwillison.net/tags/system-prompts\">system-prompts</a>, <a href=\"https://simonwillison.net/tags/codex-cli\">codex-cli</a>, <a href=\"https://simonwillison.net/tags/agentic-engineering\">agentic-engineering</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6fb4f3b2cf"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/30/gpt-55-cyber-capabilities",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Our evaluation of OpenAI's GPT-5.5 cyber capabilities",
      "headline": "Our evaluation of OpenAI's GPT-5.5 cyber capabilities",
      "url": "https://simonwillison.net/2026/Apr/30/gpt-55-cyber-capabilities",
      "published_at": "2026-04-30T23:03:24Z",
      "description": "<p><strong><a href=\"https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities\">Our evaluation of OpenAI&#x27;s GPT-5.5 cyber capabilities</a></strong></p> The UK's AI Security Institute <a href=\"https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities\">previously evaluated Claude Mythos</a>: now they've evaluated GPT-5.5 for finding security vulnerability and found it to be comparable to Mythos, but unlike Mythos it's generally available right now. <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/claude\">claude</a>, <a href=\"https://simonwillison.net/tags/ai-security-research\">ai-security-research</a>, <a href=\"https://simonwillison.net/tags/gpt\">gpt</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d416950edc"
    },
    {
      "id": "hn|https://med.stanford.edu/news/all-news/2026/04/brain-scans-individual-versus-group.html",
      "anchor": "15ca318fa2",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Group averages obscure how an individual's brain controls behavior: study",
      "headline": "Group averages obscure how an individual's brain controls behavior: study",
      "url": "https://med.stanford.edu/news/all-news/2026/04/brain-scans-individual-versus-group.html",
      "published_at": "2026-04-30T22:35:36Z",
      "discovered_at": "2026-05-03T19:59:23Z",
      "description": "HN score 100 · 27 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "anthropic|https://www.anthropic.com/engineering/april-23-postmortem",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "An update on recent Claude Code quality reports",
      "headline": "An update on recent Claude Code quality reports",
      "url": "https://www.anthropic.com/engineering/april-23-postmortem",
      "published_at": "2026-04-30T21:34:23Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "1dd900fd49"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/30/rss-vibe-coded-apps",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "We need RSS for sharing abundant vibe-coded apps",
      "headline": "We need RSS for sharing abundant vibe-coded apps",
      "url": "https://simonwillison.net/2026/Apr/30/rss-vibe-coded-apps",
      "published_at": "2026-04-30T18:38:48Z",
      "description": "<p><strong><a href=\"https://interconnected.org/home/2026/04/29/syndicating-vibes\">We need RSS for sharing abundant vibe-coded apps</a></strong></p> Matt Webb:</p> <blockquote> <p>I would love an RSS web feed for all those various tools and apps pages, each item with an “Install” button. (But install to where?)</p> <p>The lesson here is that when vibe-coding accelerates app development, apps become more personal, more situated, and more frequent. Shipping a tool or a micro-app is less like launching a website and more like posting on a blog.</p> </blockquote> <p>This inspired me to <a href=\"https://github.com/simonw/simonwillisonblog/pull/665\">have Claude</a> add an Atom feed (and icon) to my <a href=\"https://simonwillison.net/elsewhere/tool/\">/elsewhere/tools/</a> page, which itself is populated by content from my <a href=\"https://tools.simonwillison.net/\">tools.simonwillison.net</a> site. <p>Tags: <a href=\"https://simonwillison.net/tags/atom\">atom</a>, <a href=\"https://simonwillison.net/tags/matt-webb\">matt-webb</a>, <a href=\"https://simonwillison.net/tags/rss\">rss</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/vibe-coding\">vibe-coding</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a4c76200eb"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/the-long-term-benefit-trust",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "The Long-Term Benefit Trust",
      "headline": "The Long-Term Benefit Trust",
      "url": "https://www.anthropic.com/news/the-long-term-benefit-trust",
      "published_at": "2026-04-30T16:44:34Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b629f2119e"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/Evaluating-Claude-For-Bioinformatics-With-BioMysteryBench",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Evaluating Claude’s bioinformatics research capabilities with BioMysteryBench",
      "headline": "Evaluating Claude’s bioinformatics research capabilities with BioMysteryBench",
      "url": "https://www.anthropic.com/research/Evaluating-Claude-For-Bioinformatics-With-BioMysteryBench",
      "published_at": "2026-04-30T15:03:14Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d7e18dea2c"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/ai-co-clinician",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Enabling a new model for healthcare with AI co-clinician",
      "headline": "Enabling a new model for healthcare with AI co-clinician",
      "url": "https://deepmind.google/blog/ai-co-clinician",
      "published_at": "2026-04-30T12:14:15Z",
      "description": "Researching the path to AI-augmented care and development of an AI co-clinician.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3d07059dcf"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-the-inference-inflection",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] The Inference Inflection",
      "headline": "[AINews] The Inference Inflection",
      "url": "https://www.latent.space/p/ainews-the-inference-inflection",
      "published_at": "2026-04-30T01:42:51Z",
      "description": "a quiet day lets us reflect on the growing implications of the inference age",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0ec4117843"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/30/zig-anti-ai",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "The Zig project's rationale for their firm anti-AI contribution policy",
      "headline": "The Zig project's rationale for their firm anti-AI contribution policy",
      "url": "https://simonwillison.net/2026/Apr/30/zig-anti-ai",
      "published_at": "2026-04-30T01:24:23Z",
      "description": "<p><a href=\"https://ziglang.org/\">Zig</a> has one of the most stringent <a href=\"https://ziglang.org/code-of-conduct/\">anti-LLM policies</a> of any major open source project:</p> <blockquote> <p>No LLMs for issues.</p> <p>No LLMs for pull requests.</p> <p>No LLMs for comments on the bug tracker, including translation. English is encouraged, but not required. You are welcome to post in your native language and rely on others to have their own translation tools of choice to interpret your words.</p> </blockquote> <p>The most prominent project written in Zig may be the <a href=\"https://bun.com/\">Bun</a> JavaScript runtime, which was <a href=\"https://bun.com/blog/bun-joins-anthropic\">acquired by Anthropic</a> in December 2025 and, unsurprisingly, makes heavy use of AI assistance.</p> <p>Bun operates its own fork of Zig, and recently <a href=\"https://x.com/bunjavascript/status/2048427636414923250\">achieved a 4x performance improvement</a> on Bun compile after adding \"parallel semantic analysis and multiple codegen units to the llvm backend\". Here's <a href=\"https://github.com/oven-sh/zig/compare/upgrade-0.15.2%E2%80%A6upgrade-0.15.2-fast\">that code</a>. 
But <a href=\"https://twitter.com/bunjavascript/status/2048428104893542781\">@bunjavascript says</a>:</p> <blockquote> <p>We do not currently plan to upstream this, as Zig has a strict ban on LLM-authored contributions.</p> </blockquote> <p>(Update: here's <a href=\"https://ziggit.dev/t/bun-s-zig-fork-got-4x-faster-compilation-times/15183/19\">a Zig core contributor</a> providing details on why they wouldn't accept that particular patch independent of the LLM issue - parallel semantic analysis is a long planned feature but has implications \"for the Zig language itself\".)</p> <p>In <a href=\"https://kristoff.it/blog/contributor-poker-and-ai/\">Contributor Poker and Zig's AI Ban</a> (<a href=\"https://lobste.rs/s/ifcyr1/contributor_poker_zig_s_ai_ban\">via Lobste.rs</a>) Zig Software Foundation VP of Community Loris Cro explains the rationale for this strict ban. It's the best articulation I've seen yet for a blanket ban on LLM-assisted contributions:</p> <blockquote> <p>In successful open source projects you eventually reach a point where you start getting more PRs than what you’re capable of processing. Given what I mentioned so far, it would make sense to stop accepting imperfect PRs in order to maximize ROI from your work, but that’s not what we do in the Zig project. Instead, <strong>we try our best to help new contributors to get their work in, even if they need some help getting there</strong>. We don’t do this just because it’s the “right” thing to do, but also <strong>because it’s the smart thing to do</strong>.</p> </blockquote> <p>Zig values contributors over their contributions. Each contributor represents an investment by the Zig core team - the primary goal of reviewing and accepting PRs isn't to land new code, it's to help grow new contributors who can become trusted and prolific over time.</p> <p>LLM assistance breaks that completely. 
It doesn't matter if the LLM helps you submit a <em>perfect</em> PR to Zig - the time the Zig team spends reviewing your work does nothing to help them add new, confident, trustworthy contributors to their overall project.</p> <p>Loris explains the name here:</p> <blockquote> <p>The reason I call it “contributor poker” is because, just like people say about the actual card game, “you play the person, not the cards”. In contributor poker, you bet on the contributor, not on the contents of their first PR.</p> </blockquote> <p>This makes a lot of sense to me. It relates to an idea I've seen circulating elsewhere: if a PR was mostly written by an LLM, why should a project maintainer spend time reviewing and discussing that PR as opposed to firing up their own LLM to solve the same problem?</p> <p>Tags: <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/zig\">zig</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/open-source\">open-source</a>, <a href=\"https://simonwillison.net/tags/javascript\">javascript</a>, <a href=\"https://simonwillison.net/tags/ai-assisted-programming\">ai-assisted-programming</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/bun\">bun</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8a82e6c161"
    },
    {
      "id": "openai|https://openai.com/index/advanced-account-security",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Introducing Advanced Account Security",
      "headline": "Introducing Advanced Account Security",
      "url": "https://openai.com/index/advanced-account-security",
      "published_at": "2026-04-30T00:00:00Z",
      "description": "Introducing Advanced Account Security: phishing-resistant login, stronger recovery, and enhanced protections to safeguard sensitive data and prevent account takeover.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "75734e21aa"
    },
    {
      "id": "xai|https://x.ai/news/grok-custom-voices",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Custom Voices and Voice Library",
      "headline": "Custom Voices and Voice Library",
      "url": "https://x.ai/news/grok-custom-voices",
      "published_at": "2026-04-30T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "230a2bb9cd"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/29/llm-3",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "llm 0.32a1",
      "headline": "llm 0.32a1",
      "url": "https://simonwillison.net/2026/Apr/29/llm-3",
      "published_at": "2026-04-29T23:52:50Z",
      "description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/llm/releases/tag/0.32a1\">llm 0.32a1</a></p> <blockquote> <ul> <li>Fixed a bug in 0.32a0 where tool-calling conversations were not correctly reinflated from SQLite. <a href=\"https://github.com/simonw/llm/issues/1426\">#1426</a></li> </ul> </blockquote> <p>Tags: <a href=\"https://simonwillison.net/tags/llm\">llm</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "64bb36ce40"
    },
    {
      "id": "hn|https://www.wired.com/story/when-robots-have-their-chatgpt-moment-remember-these-pincers",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Eka’s robotic claw feels like we're approaching a ChatGPT moment",
      "headline": "Eka’s robotic claw feels like we're approaching a ChatGPT moment",
      "url": "https://www.wired.com/story/when-robots-have-their-chatgpt-moment-remember-these-pincers/",
      "published_at": "2026-04-29T22:56:10Z",
      "description": "HN score 153 · 216 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a762bd9536"
    },
    {
      "id": "mistral|https://mistral.ai/news/la-plateforme",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "La Plateforme",
      "headline": "La Plateforme",
      "url": "https://mistral.ai/news/la-plateforme",
      "published_at": "2026-04-29T21:56:12Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "625afc586d"
    },
    {
      "id": "openai|https://openai.com/index/where-the-goblins-came-from",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Where the goblins came from",
      "headline": "Where the goblins came from",
      "url": "https://openai.com/index/where-the-goblins-came-from",
      "published_at": "2026-04-29T20:00:00Z",
      "description": "How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "c4c59a68f9"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/29/llm",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "LLM 0.32a0 is a major backwards-compatible refactor",
      "headline": "LLM 0.32a0 is a major backwards-compatible refactor",
      "url": "https://simonwillison.net/2026/Apr/29/llm",
      "published_at": "2026-04-29T19:01:47Z",
      "description": "<p>I just released <a href=\"https://llm.datasette.io/en/latest/changelog.html#a0-2026-04-28\">LLM 0.32a0</a>, an alpha release of my <a href=\"https://llm.datasette.io/\">LLM</a> Python library and CLI tool for accessing LLMs, with some consequential changes that I've been working towards for quite a while.</p> <p>Previous versions of LLM modeled the world in terms of prompts and responses. Send the model a text prompt, get back a text response.</p> <pre><span class=\"pl-k\">import</span> <span class=\"pl-s1\">llm</span> <span class=\"pl-s1\">model</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">llm</span>.<span class=\"pl-c1\">get_model</span>(<span class=\"pl-s\">\"gpt-5.5\"</span>) <span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span class=\"pl-c1\">prompt</span>(<span class=\"pl-s\">\"Capital of France?\"</span>) <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">response</span>.<span class=\"pl-c1\">text</span>())</pre> <p>This made sense when I started working on the library back in April 2023. A lot has changed since then!</p> <p>LLM provides an abstraction over thousands of different models via its <a href=\"https://llm.datasette.io/en/stable/plugins/index.html\">plugin system</a>. The original abstraction - of text input that returns text output - was no longer able to represent everything I needed it to.</p> <p>Over time LLM itself has grown <a href=\"https://simonwillison.net/2024/Oct/29/llm-multi-modal/\">attachments</a> to handle image, audio, and video input, then <a href=\"https://simonwillison.net/2025/Feb/28/llm-schemas/\">schemas</a> for outputting structured JSON, then <a href=\"https://simonwillison.net/2025/May/27/llm-tools/\">tools</a> for executing tool calls. 
Meanwhile LLMs kept evolving, adding reasoning support and the ability to return images and all kinds of other interesting capabilities.</p> <p>LLM needs to evolve to better handle the diversity of input and output types that can be processed by today's frontier models.</p> <p>The 0.32a0 alpha has two key changes: model inputs can be represented as a sequence of messages, and model responses can be composed of a stream of differently typed parts.</p> <h4 id=\"prompts-as-a-sequence-of-messages\">Prompts as a sequence of messages</h4> <p>LLMs accept input as text, but ever since ChatGPT demonstrated the value of a two-way conversational interface, the most common way to prompt them has been to treat that input as a sequence of conversational turns.</p> <p>The first turn might look like this:</p> <pre><code>user: Capital of France? assistant: </code></pre> <p>(The model then gets to fill out the reply from the assistant.)</p> <p>But each subsequent turn needs to replay the entire conversation up to that point, as a sort of screenplay:</p> <pre><code>user: Capital of France? assistant: Paris user: Germany? assistant: </code></pre> <p>Most of the JSON APIs from the major vendors follow this pattern. 
Here's what the above looks like using the OpenAI chat completions API, which has been widely imitated by other providers:</p> <div class=\"highlight highlight-source-shell\"><pre>curl https://api.openai.com/v1/chat/completions \\ -H <span class=\"pl-s\"><span class=\"pl-pds\">\"</span>Authorization: Bearer <span class=\"pl-smi\">$OPENAI_API_KEY</span><span class=\"pl-pds\">\"</span></span> \\ -H <span class=\"pl-s\"><span class=\"pl-pds\">\"</span>Content-Type: application/json<span class=\"pl-pds\">\"</span></span> \\ -d <span class=\"pl-s\"><span class=\"pl-pds\">'</span>{</span> <span class=\"pl-s\"> \"model\": \"gpt-5.5\",</span> <span class=\"pl-s\"> \"messages\": [</span> <span class=\"pl-s\"> {</span> <span class=\"pl-s\"> \"role\": \"user\",</span> <span class=\"pl-s\"> \"content\": \"Capital of France?\"</span> <span class=\"pl-s\"> },</span> <span class=\"pl-s\"> {</span> <span class=\"pl-s\"> \"role\": \"assistant\",</span> <span class=\"pl-s\"> \"content\": \"Paris\"</span> <span class=\"pl-s\"> },</span> <span class=\"pl-s\"> {</span> <span class=\"pl-s\"> \"role\": \"user\",</span> <span class=\"pl-s\"> \"content\": \"Germany?\"</span> <span class=\"pl-s\"> }</span> <span class=\"pl-s\"> ]</span> <span class=\"pl-s\"> }<span class=\"pl-pds\">'</span></span></pre></div> <p>Prior to 0.32, LLM modeled these as conversations:</p> <pre><span class=\"pl-s1\">model</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">llm</span>.<span class=\"pl-c1\">get_model</span>(<span class=\"pl-s\">\"gpt-5.5\"</span>) <span class=\"pl-s1\">conversation</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span class=\"pl-c1\">conversation</span>() <span class=\"pl-s1\">r1</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">conversation</span>.<span class=\"pl-c1\">prompt</span>(<span class=\"pl-s\">\"Capital of France?\"</span>) <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">r1</span>.<span class=\"pl-c1\">text</span>()) 
<span class=\"pl-c\"># Outputs \"Paris\"</span> <span class=\"pl-s1\">r2</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">conversation</span>.<span class=\"pl-c1\">prompt</span>(<span class=\"pl-s\">\"Germany?\"</span>) <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">r2</span>.<span class=\"pl-c1\">text</span>()) <span class=\"pl-c\"># Outputs \"Berlin\"</span></pre> <p>This worked if you were building a conversation with the model from scratch, but it didn't provide a way to feed in a previous conversation from the start. This made tasks like building an emulation of the OpenAI chat completions API much harder than they should have been.</p> <p>The <code>llm</code> CLI tool worked around this through a custom mechanism for persisting and inflating conversations using SQLite, but that never became a stable part of the LLM API - and there are many places you might want to use the Python library without committing to SQLite as the storage layer.</p> <p>The new alpha now supports this:</p> <pre><span class=\"pl-k\">import</span> <span class=\"pl-s1\">llm</span> <span class=\"pl-k\">from</span> <span class=\"pl-s1\">llm</span> <span class=\"pl-k\">import</span> <span class=\"pl-s1\">user</span>, <span class=\"pl-s1\">assistant</span> <span class=\"pl-s1\">model</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">llm</span>.<span class=\"pl-c1\">get_model</span>(<span class=\"pl-s\">\"gpt-5.5\"</span>) <span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span class=\"pl-c1\">prompt</span>(<span class=\"pl-s1\">messages</span><span class=\"pl-c1\">=</span>[ <span class=\"pl-en\">user</span>(<span class=\"pl-s\">\"Capital of France?\"</span>), <span class=\"pl-en\">assistant</span>(<span class=\"pl-s\">\"Paris\"</span>), <span class=\"pl-en\">user</span>(<span class=\"pl-s\">\"Germany?\"</span>), ]) <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">response</span>.<span 
class=\"pl-c1\">text</span>())</pre> <p>The <code>llm.user()</code> and <code>llm.assistant()</code> functions are new builder functions designed to be used within that <code>messages=[]</code> array.</p> <p>The previous <code>prompt=</code> option still works, but LLM upgrades it to a single-item messages array behind the scenes.</p> <p>You can also now <em>reply</em> to a response, as an alternative to building a conversation:</p> <pre><span class=\"pl-s1\">response2</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">response</span>.<span class=\"pl-c1\">reply</span>(<span class=\"pl-s\">\"How about Hungary?\"</span>) <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">response2</span>) <span class=\"pl-c\"># Default __str__() calls .text()</span></pre> <h4 id=\"streaming-parts\">Streaming parts</h4> <p>The other major new interface in the alpha concerns streaming results back from a prompt.</p> <p>Previously, LLM supported streaming like this:</p> <pre><span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span class=\"pl-c1\">prompt</span>(<span class=\"pl-s\">\"Generate an SVG of a pelican riding a bicycle\"</span>) <span class=\"pl-k\">for</span> <span class=\"pl-s1\">chunk</span> <span class=\"pl-c1\">in</span> <span class=\"pl-s1\">response</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">chunk</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>)</pre> <p>Or this async variant:</p> <pre><span class=\"pl-k\">import</span> <span class=\"pl-s1\">asyncio</span> <span class=\"pl-k\">import</span> <span class=\"pl-s1\">llm</span> <span class=\"pl-s1\">model</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">llm</span>.<span class=\"pl-c1\">get_async_model</span>(<span class=\"pl-s\">\"gpt-5.5\"</span>) <span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span 
class=\"pl-c1\">prompt</span>(<span class=\"pl-s\">\"Generate an SVG of a pelican riding a bicycle\"</span>) <span class=\"pl-k\">async</span> <span class=\"pl-k\">def</span> <span class=\"pl-en\">run</span>(): <span class=\"pl-k\">async</span> <span class=\"pl-k\">for</span> <span class=\"pl-s1\">chunk</span> <span class=\"pl-c1\">in</span> <span class=\"pl-s1\">response</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">chunk</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-s1\">asyncio</span>.<span class=\"pl-c1\">run</span>(<span class=\"pl-en\">run</span>())</pre> <p>Many of today's models return mixed types of content. A prompt run against Claude might return reasoning output, then text, then a JSON request for a tool call, then more text content.</p> <p>Some models can even execute tools on the server-side, for example OpenAI's <a href=\"https://developers.openai.com/api/docs/guides/tools-code-interpreter?lang=curl\">code interpreter tool</a> or Anthropic's <a href=\"https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool\">web search</a>. This means the results from the model can combine text, tool calls, tool outputs and other formats.</p> <p>Multi-modal output models are starting to emerge too, which can return images or even <a href=\"https://developers.openai.com/api/docs/guides/audio#add-audio-to-your-existing-application\">snippets of audio</a> intermixed into that streaming response.</p> <p>The new LLM alpha models these as a stream of typed message parts. 
Here's what that looks like as a Python API consumer:</p> <pre><span class=\"pl-k\">import</span> <span class=\"pl-s1\">asyncio</span> <span class=\"pl-k\">import</span> <span class=\"pl-s1\">llm</span> <span class=\"pl-s1\">model</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">llm</span>.<span class=\"pl-c1\">get_model</span>(<span class=\"pl-s\">\"gpt-5.5\"</span>) <span class=\"pl-s1\">prompt</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s\">\"invent 3 cool dogs, first talk about your motivations\"</span> <span class=\"pl-k\">def</span> <span class=\"pl-en\">describe_dog</span>(<span class=\"pl-s1\">name</span>: <span class=\"pl-smi\">str</span>, <span class=\"pl-s1\">bio</span>: <span class=\"pl-smi\">str</span>) <span class=\"pl-c1\">-&gt;</span> <span class=\"pl-smi\">str</span>: <span class=\"pl-s\">\"\"\"Record the name and biography of a hypothetical dog.\"\"\"</span> <span class=\"pl-k\">return</span> <span class=\"pl-s\">f\"<span class=\"pl-s1\"><span class=\"pl-kos\">{</span><span class=\"pl-s1\">name</span><span class=\"pl-kos\">}</span></span>: <span class=\"pl-s1\"><span class=\"pl-kos\">{</span><span class=\"pl-s1\">bio</span><span class=\"pl-kos\">}</span></span>\"</span> <span class=\"pl-k\">def</span> <span class=\"pl-en\">sync_example</span>(): <span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span class=\"pl-c1\">prompt</span>( <span class=\"pl-s1\">prompt</span>, <span class=\"pl-s1\">tools</span><span class=\"pl-c1\">=</span>[<span class=\"pl-s1\">describe_dog</span>], ) <span class=\"pl-k\">for</span> <span class=\"pl-s1\">event</span> <span class=\"pl-c1\">in</span> <span class=\"pl-s1\">response</span>.<span class=\"pl-c1\">stream_events</span>(): <span class=\"pl-k\">if</span> <span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">type</span> <span class=\"pl-c1\">==</span> <span class=\"pl-s\">\"text\"</span>: <span class=\"pl-en\">print</span>(<span 
class=\"pl-s1\">event</span>.<span class=\"pl-c1\">chunk</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-k\">elif</span> <span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">type</span> <span class=\"pl-c1\">==</span> <span class=\"pl-s\">\"tool_call_name\"</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s\">f\"<span class=\"pl-cce\">\\n</span>Tool call: <span class=\"pl-s1\"><span class=\"pl-kos\">{</span><span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">chunk</span><span class=\"pl-kos\">}</span></span>(\"</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-k\">elif</span> <span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">type</span> <span class=\"pl-c1\">==</span> <span class=\"pl-s\">\"tool_call_args\"</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">chunk</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-k\">async</span> <span class=\"pl-k\">def</span> <span class=\"pl-en\">async_example</span>(): <span class=\"pl-s1\">model</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">llm</span>.<span class=\"pl-c1\">get_async_model</span>(<span class=\"pl-s\">\"gpt-5.5\"</span>) <span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">model</span>.<span class=\"pl-c1\">prompt</span>( <span class=\"pl-s1\">prompt</span>, <span class=\"pl-s1\">tools</span><span class=\"pl-c1\">=</span>[<span class=\"pl-s1\">describe_dog</span>], ) <span 
class=\"pl-k\">async</span> <span class=\"pl-k\">for</span> <span class=\"pl-s1\">event</span> <span class=\"pl-c1\">in</span> <span class=\"pl-s1\">response</span>.<span class=\"pl-c1\">astream_events</span>(): <span class=\"pl-k\">if</span> <span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">type</span> <span class=\"pl-c1\">==</span> <span class=\"pl-s\">\"text\"</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">chunk</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-k\">elif</span> <span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">type</span> <span class=\"pl-c1\">==</span> <span class=\"pl-s\">\"tool_call_name\"</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s\">f\"<span class=\"pl-cce\">\\n</span>Tool call: <span class=\"pl-s1\"><span class=\"pl-kos\">{</span><span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">chunk</span><span class=\"pl-kos\">}</span></span>(\"</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-k\">elif</span> <span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">type</span> <span class=\"pl-c1\">==</span> <span class=\"pl-s\">\"tool_call_args\"</span>: <span class=\"pl-en\">print</span>(<span class=\"pl-s1\">event</span>.<span class=\"pl-c1\">chunk</span>, <span class=\"pl-s1\">end</span><span class=\"pl-c1\">=</span><span class=\"pl-s\">\"\"</span>, <span class=\"pl-s1\">flush</span><span class=\"pl-c1\">=</span><span class=\"pl-c1\">True</span>) <span class=\"pl-en\">sync_example</span>() <span class=\"pl-s1\">asyncio</span>.<span class=\"pl-c1\">run</span>(<span class=\"pl-en\">async_example</span>())</pre> <p>Sample 
output (from just the first sync example):</p> <blockquote> <p><code>My motivation: create three memorable dogs with distinct “cool” styles—one cinematic, one adventurous, and one charmingly chaotic—so each feels like they could star in their own story.</code><br /> <code>Tool call: describe_dog({\"name\": \"Nova Jetpaw\", \"bio\": \"A sleek silver-gray whippet who wears tiny aviator goggles and loves sprinting along moonlit beaches. Nova is fearless, elegant, and rumored to outrun drones just for fun.\"}</code><br /> <code>Tool call: describe_dog({\"name\": \"Mochi Thunderbark\", \"bio\": \"A fluffy corgi with a dramatic black-and-gold bandana and the confidence of a rock star. Mochi is short, loud, loyal, and leads a neighborhood 'security patrol' made entirely of squirrels.\"}</code><br /> <code>Tool call: describe_dog({\"name\": \"Atlas Snowfang\", \"bio\": \"A massive white husky with ice-blue eyes and a backpack full of trail snacks. Atlas is calm, heroic, and always knows the way home—even during blizzards, fog, or confusing camping trips.\"}</code></p> </blockquote> <p>At the end of the response you can call <code>response.execute_tool_calls()</code> to actually run the functions that were requested, or send a <code>response.reply()</code> to have those tools called and their return values sent back to the model:</p> <pre><span class=\"pl-en\">print</span>(<span class=\"pl-s1\">response</span>.<span class=\"pl-c1\">reply</span>(<span class=\"pl-s\">\"Tell me about the dogs\"</span>))</pre> <p>This new mechanism for streaming different token types means the CLI tool can now display \"thinking\" text in a different color from the text in the final response. 
The thinking text goes to stderr so it won't affect results that are piped into other tools.</p> <p>This example uses Claude Sonnet 4.6 (with an updated streaming event version of the <a href=\"https://github.com/simonw/llm-anthropic\">llm-anthropic</a> plugin) as Anthropic's models return their reasoning text as part of the response:</p> <div class=\"highlight highlight-source-shell\"><pre>llm -m claude-sonnet-4.6 <span class=\"pl-s\"><span class=\"pl-pds\">'</span>Think about 3 cool dogs then describe them<span class=\"pl-pds\">'</span></span> \\ -o thinking_display 1</pre></div> <p><img src=\"https://static.simonwillison.net/static/2026/claude-thinking-llm.gif\" alt=\"Animated demo. Starts with ~/dev/scratch/llm-anthropic % uv run llm -m claude-sonnet-4.6 'Think about 3 cool dogs then describe them' -o thinking_display 1 - the text then streams in grey: The user wants me to think about 3 cool dogs and then describe them. Let me come up with 3 interesting, cool dogs and describe them. Then switches to regular color text for the output that describes the dogs.\" style=\"max-width: 100%;\" /></p> <p>You can suppress the output of reasoning tokens using the new <code>-R/--no-reasoning</code> flag. Surprisingly that ended up being the only CLI-facing change in this release.</p> <h4 id=\"a-mechanism-for-serializing-and-deserializing-responses\">A mechanism for serializing and deserializing responses</h4> <p>As mentioned earlier, LLM has quite inflexible code at the moment for persisting conversations to SQLite. 
I've added a new mechanism in 0.32a0 that should provide Python API users a way to roll their own alternative:</p> <pre><span class=\"pl-s1\">serializable</span> <span class=\"pl-c1\">=</span> <span class=\"pl-s1\">response</span>.<span class=\"pl-c1\">to_dict</span>() <span class=\"pl-c\"># serializable is a JSON-style dictionary</span> <span class=\"pl-c\"># store it anywhere you like, then inflate it:</span> <span class=\"pl-s1\">response</span> <span class=\"pl-c1\">=</span> <span class=\"pl-v\">Response</span>.<span class=\"pl-c1\">from_dict</span>(<span class=\"pl-s1\">serializable</span>)</pre> <p>The dictionary this returns is actually a <code>TypedDict</code> defined in the new <a href=\"https://github.com/simonw/llm/blob/main/llm/serialization.py\">llm/serialization.py</a> module.</p> <h4 id=\"what-s-next-\">What's next?</h4> <p>I'm releasing this as an alpha so I can upgrade various plugins and exercise the new design in real world environments for a few days. I expect the stable 0.32 release will be very similar to this alpha, unless alpha testing reveals some design flaw in the way I've put this all together.</p> <p>There's one remaining large task: I'd like to redesign the SQLite logging system to better capture the more finely grained details that are returned by this new abstraction.</p> <p>Ideally I'd like to model this as a graph, to best support situations like an OpenAI-style chat completions API where the same conversations are constantly extended and then repeated with every prompt. 
I want to be able to store those without duplicating them in the database.</p> <p>I'm undecided as to whether that should be a feature in 0.32 or I should hold it for 0.33.</p> <p>Tags: <a href=\"https://simonwillison.net/tags/projects\">projects</a>, <a href=\"https://simonwillison.net/tags/python\">python</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/annotated-release-notes\">annotated-release-notes</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm\">llm</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2e78d460c9"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/29/llm-2",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "llm 0.32a0",
      "headline": "llm 0.32a0",
      "url": "https://simonwillison.net/2026/Apr/29/llm-2",
      "published_at": "2026-04-29T18:57:47Z",
      "description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/llm/releases/tag/0.32a0\">llm 0.32a0</a></p> <p>See <a href=\"https://simonwillison.net/2026/Apr/29/llm/\">the annotated release notes</a>.</p> <p>Tags: <a href=\"https://simonwillison.net/tags/llm\">llm</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9f4d4bc559"
    },
    {
      "id": "openai|https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Building the compute infrastructure for the Intelligence Age",
      "headline": "Building the compute infrastructure for the Intelligence Age",
      "url": "https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age",
      "published_at": "2026-04-29T15:00:00Z",
      "description": "OpenAI scales Stargate to build the compute infrastructure powering AGI, adding new data center capacity to meet growing AI demand.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "26fe19711b"
    },
    {
      "id": "hn|https://github.com/bruin-data/dac",
      "anchor": "618d03c43a",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "Show HN: DAC – open-source dashboard as code tool for agents and humans",
      "headline": "Show HN: DAC – open-source dashboard as code tool for agents and humans",
      "url": "https://github.com/bruin-data/dac",
      "published_at": "2026-04-29T14:37:20Z",
      "discovered_at": "2026-05-03T03:36:48Z",
      "description": "HN score 100 · 31 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "mistral|https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Remote agents in Vibe. Powered by Mistral Medium 3.5.",
      "headline": "Remote agents in Vibe. Powered by Mistral Medium 3.5.",
      "url": "https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5",
      "published_at": "2026-04-29T14:31:11Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ecbe6e6a57"
    },
    {
      "id": "mistral|https://mistral.ai/news/workflows",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Workflows for work that runs the business",
      "headline": "Workflows for work that runs the business",
      "url": "https://mistral.ai/news/workflows",
      "published_at": "2026-04-29T13:49:09Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b62a85bd98"
    },
    {
      "id": "openai|https://openai.com/index/cybersecurity-in-the-intelligence-age",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Cybersecurity in the Intelligence Age",
      "headline": "Cybersecurity in the Intelligence Age",
      "url": "https://openai.com/index/cybersecurity-in-the-intelligence-age",
      "published_at": "2026-04-29T04:00:00Z",
      "description": "OpenAI outlines a five-part action plan for strengthening cybersecurity in the Intelligence Age, focused on democratizing AI-powered cyber defense and protecting critical systems.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8d6e1095ec"
    },
    {
      "id": "huggingface|https://huggingface.co/inclusionAI/Ling-2.6-1T",
      "anchor": "44ece2849f",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "inclusionAI/Ling-2.6-1T · 194♥ · 747 dl",
      "headline": "inclusionAI/Ling-2.6-1T · 194♥ · 747 dl",
      "url": "https://huggingface.co/inclusionAI/Ling-2.6-1T",
      "published_at": "2026-04-29T03:19:36Z",
      "discovered_at": "2026-05-04T08:04:56Z",
      "description": "text-generation",
      "summarize": true,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "hn|https://www.scd31.com/posts/diy-hair-electrolysis-machine",
      "anchor": "74906f0794",
      "source_id": "hn",
      "source_name": "Hacker News",
      "category": "community",
      "category_emoji": "🔥",
      "hashtags": [
        "community",
        "hn"
      ],
      "title": "I built my own hair electrolysis machine",
      "headline": "I built my own hair electrolysis machine",
      "url": "https://www.scd31.com/posts/diy-hair-electrolysis-machine",
      "published_at": "2026-04-29T02:42:46Z",
      "discovered_at": "2026-05-03T20:44:47Z",
      "description": "HN score 103 · 19 comments",
      "summarize": false,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-not-much-happened-today",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] not much happened today",
      "headline": "[AINews] not much happened today",
      "url": "https://www.latent.space/p/ainews-not-much-happened-today",
      "published_at": "2026-04-29T01:46:59Z",
      "description": "a quiet day.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2d081f0501"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/28/openai-codex",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Quoting OpenAI Codex base_instructions",
      "headline": "Quoting OpenAI Codex base_instructions",
      "url": "https://simonwillison.net/2026/Apr/28/openai-codex",
      "published_at": "2026-04-28T22:02:53Z",
      "description": "<blockquote cite=\"https://github.com/openai/codex/blob/66b0781502be5de3b1909525c987643b9e5e407d/codex-rs/models-manager/models.json#L55\"><p><code>Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query.</code></p></blockquote> <p class=\"cite\">&mdash; <a href=\"https://github.com/openai/codex/blob/66b0781502be5de3b1909525c987643b9e5e407d/codex-rs/models-manager/models.json#L55\">OpenAI Codex base_instructions</a>, for GPT-5.5</p> <p>Tags: <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/system-prompts\">system-prompts</a>, <a href=\"https://simonwillison.net/tags/prompt-engineering\">prompt-engineering</a>, <a href=\"https://simonwillison.net/tags/codex-cli\">codex-cli</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/gpt\">gpt</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8c3541c80c"
    },
    {
      "id": "mistral|https://mistral.ai/news/voxtral-tts",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Speaking of Voxtral",
      "headline": "Speaking of Voxtral",
      "url": "https://mistral.ai/news/voxtral-tts",
      "published_at": "2026-04-28T12:32:29Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d66d65c347"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/election-safeguards-update",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "An update on our election safeguards",
      "headline": "An update on our election safeguards",
      "url": "https://www.anthropic.com/news/election-safeguards-update",
      "published_at": "2026-04-28T08:59:45Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3eb54ff913"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-imagegen-is-on-the-path-to",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] ImageGen is on the Path to AGI",
      "headline": "[AINews] ImageGen is on the Path to AGI",
      "url": "https://www.latent.space/p/ainews-imagegen-is-on-the-path-to",
      "published_at": "2026-04-28T05:38:19Z",
      "description": "reflecting on the continued GPT-Image-2 explosion",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2a3b37f2bb"
    },
    {
      "id": "huggingface|https://huggingface.co/inclusionAI/Ling-2.6-flash",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "inclusionAI/Ling-2.6-flash · 137♥ · 943 dl",
      "headline": "HuggingFace model Ling-2.6-flash by inclusionAI reached 943 downloads and 137 likes.",
      "url": "https://huggingface.co/inclusionAI/Ling-2.6-flash",
      "published_at": "2026-04-28T03:27:56Z",
      "description": "",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:27:06Z",
      "anchor": "f5a15151ca"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/28/talkie",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Introducing talkie: a 13B vintage language model from 1930",
      "headline": "Introducing talkie: a 13B vintage language model from 1930",
      "url": "https://simonwillison.net/2026/Apr/28/talkie",
      "published_at": "2026-04-28T02:47:42Z",
      "description": "<p><strong><a href=\"https://talkie-lm.com/introducing-talkie\">Introducing talkie: a 13B vintage language model from 1930</a></strong></p> <p>New project from <a href=\"https://nlevine.org\">Nick Levine</a>, <a href=\"http://www.cs.toronto.edu/~duvenaud/\">David Duvenaud</a>, and <a href=\"https://en.wikipedia.org/wiki/Alec_Radford\">Alec Radford</a> (of GPT, GPT-2, Whisper fame).</p> <p><a href=\"https://huggingface.co/talkie-lm/talkie-1930-13b-base\">talkie-1930-13b-base</a> (53.1 GB) is a \"13B language model trained on 260B tokens of historical pre-1931 English text\". </p> <p><a href=\"https://huggingface.co/talkie-lm/talkie-1930-13b-it\">talkie-1930-13b-it</a> (26.6 GB) is a checkpoint \"finetuned using a novel dataset of instruction-response pairs extracted from pre-1931 reference works\", designed to power a chat interface. You can <a href=\"https://talkie-lm.com/chat\">try that out here</a>.</p> <p>Both models are Apache 2.0 licensed. Since the training data for the base model is entirely out of copyright (the USA copyright cutoff date is currently January 1, 1931), I'm hoping they later decide to release the training data as well.</p> <p><em>Update</em> on that: <a href=\"https://twitter.com/status_effects/status/2049065134014726301\">Nick Levine on Twitter</a>:</p> <blockquote> <p>Will publish more on the corpus in the future (and do our best to share the data or at least scripts to reproduce it).</p> </blockquote> <p>Their report suggests some fascinating research objectives for this class of model, including:</p> <ul> <li>How good are these models at predicting the future? \"we calculated the surprisingness of short descriptions of historical events to a 13B model trained on pre-1931 text\"</li> <li>Can these models invent things that are past their knowledge cutoffs? 
\"As Demis Hassabis has asked, could a model trained up to 1911 independently discover General Relativity, as Einstein did in 1915?\"</li> <li>Can they be taught to program? \"Figure 3 (left-hand side) shows an early example of such a test, measuring how well models trained on pre-1931 text can, when given a few demonstration examples of <a href=\"https://github.com/openai/human-eval\">Python programs</a>, write new correct programs.\"</li> </ul> <p>I have a long-running interest in what I call \"vegan models\" - LLMs that are trained entirely on licensed or out-of-copyright data. I think the base model here qualifies, but the chat model isn't entirely pure due to the reliance on non-vegan models to help with the fine-tuning - emphasis mine:</p> <blockquote> <p>First, we generated instruction-response pairs from historical texts with regular structure, such as etiquette manuals, letter-writing manuals, cookbooks, dictionaries, encyclopedias, and poetry and fable collections (see Figure 7), and fine-tuned our base model on them using a simple chat format.</p> <p>Next, to improve instruction-following abilities, we generated synthetic prompts covering different types of tasks, such as summarizing documents, responding to direct information requests, and continuing multi-turn conversations coherently. We then ran online direct preference optimization on rollouts generated from these prompts, <strong>using Claude Sonnet 4.6 as a judge</strong>. [...]</p> <p>Finally, we did another round of supervised fine-tuning, this time on rejection-sampled <strong>multi-turn synthetic chats between Claude Opus 4.6 and talkie</strong>, to smooth out persistent rough edges in its conversational abilities.</p> </blockquote> <p>A major challenge in training talkie was avoiding accidental contamination, either from post-1931 text that snuck into the training corpus or from the fine-tuning assistance from modern LLMs introducing anachronistic knowledge to the chat model. 
</p> <p>I recently <a href=\"https://simonwillison.net/2026/Mar/30/mr-chatterbox/\">wrote about Mr. Chatterbox</a>, a similar vintage language model project which also had to turn to modern LLMs to create synthetic conversations to help train the chat model.</p> <p>The talkie team aspire to eventually move beyond this limitation:</p> <blockquote> <p>While we have tried to post-train talkie free from modern influence, reinforcement learning with AI feedback inevitably shapes talkie’s behavior anachronistically. (The 7B version of talkie emerged from RL speaking in listicles.) As we scale up, we hope to be able to use our vintage base models themselves as judges to enable a fully bootstrapped era-appropriate post-training pipeline.</p> </blockquote> <p>I couldn't resist hitting the talkie demo with my classic test prompt:</p> <blockquote> <p><code>Generate an SVG of a pelican riding a bicycle</code></p> </blockquote> <p>It replied:</p> <blockquote> <p>An SVG of a pelican riding a bicycle was generated in 1860. The pelican was represented as perched on the saddle, with its long bill pointed forward, and its legs astride the handles. The idea was suggested by the fact that pelicans are sometimes seen fishing on horseback, on the banks of the Rhine.</p> </blockquote> <p><small>Via <a href=\"https://news.ycombinator.com/item?id=47927903\">Hacker News</a></small></p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/local-llms\">local-llms</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/training-data\">training-data</a>, <a href=\"https://simonwillison.net/tags/ai-ethics\">ai-ethics</a>, <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2aa967ac01"
    },
    {
      "id": "openai|https://openai.com/index/our-commitment-to-community-safety",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Our commitment to community safety",
      "headline": "Our commitment to community safety",
      "url": "https://openai.com/index/our-commitment-to-community-safety",
      "published_at": "2026-04-28T00:00:00Z",
      "description": "Learn how OpenAI protects community safety in ChatGPT through model safeguards, misuse detection, policy enforcement, and collaboration with safety experts.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "608a66b75c"
    },
    {
      "id": "openai|https://openai.com/index/openai-on-aws",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "OpenAI models, Codex, and Managed Agents come to AWS",
      "headline": "OpenAI models, Codex, and Managed Agents come to AWS",
      "url": "https://openai.com/index/openai-on-aws",
      "published_at": "2026-04-28T00:00:00Z",
      "description": "OpenAI GPT models, Codex, and Managed Agents are now available on AWS, enabling enterprises to build secure AI in their AWS environments.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "c564b44d83"
    },
    {
      "id": "latentspace|https://www.latent.space/p/appliedintuition",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "Physical AI that Moves the World — Qasar Younis & Peter Ludwig, Applied Intuition",
      "headline": "Physical AI that Moves the World — Qasar Younis & Peter Ludwig, Applied Intuition",
      "url": "https://www.latent.space/p/appliedintuition",
      "published_at": "2026-04-27T23:02:37Z",
      "description": "Applied Intuition puts the AI in mining rigs, drones, trucks, warships and physical vehicles in the most adversarial environments imaginable. We dive in with their CEO and CTO as they emerge.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "e415a0b75e"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/theo-hourmouzis-general-manager-australia-new-zealand",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Anthropic Sydney office",
      "headline": "Anthropic Sydney office",
      "url": "https://www.anthropic.com/news/theo-hourmouzis-general-manager-australia-new-zealand",
      "published_at": "2026-04-27T22:30:22Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0aa9b8cfc5"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/27/now-deceased-agi-clause",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Tracking the history of the now-deceased OpenAI Microsoft AGI clause",
      "headline": "Tracking the history of the now-deceased OpenAI Microsoft AGI clause",
      "url": "https://simonwillison.net/2026/Apr/27/now-deceased-agi-clause",
      "published_at": "2026-04-27T18:38:17Z",
      "description": "<p>For many years, Microsoft and OpenAI's relationship has included a weird clause saying that, should AGI be achieved, Microsoft's commercial IP rights to OpenAI's technology would be null and void. That clause appeared to end today. I decided to try and track its expression over time on <a href=\"https://openai.com/\">openai.com</a>.</p> <p>OpenAI, July 22nd 2019 in <a href=\"https://openai.com/index/microsoft-invests-in-and-partners-with-openai/\">Microsoft invests in and partners with OpenAI to support us building beneficial AGI</a> (emphasis mine):</p> <blockquote> <p>OpenAI is producing a sequence of increasingly powerful AI technologies, which requires a lot of capital for computational power. The most obvious way to cover costs is to build a product, but that would mean changing our focus. Instead, we intend to license <strong>some of our pre-AGI technologies</strong>, with Microsoft becoming our preferred partner for commercializing them.</p> </blockquote> <p>But what <em>is</em> AGI? 
The <a href=\"https://openai.com/charter/\">OpenAI Charter</a> was first published in April 2018 and has remained unchanged at least since this <a href=\"https://web.archive.org/web/20190311213352/https://openai.com/charter/\">March 11th 2019 archive.org capture</a>:</p> <blockquote> <p>OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity.</p> </blockquote> <p>Here's the problem: if you're going to sign an agreement with Microsoft that is dependent on knowing when \"AGI\" has been achieved, you need something a little more concrete.</p> <p>In December 2024 <a href=\"https://www.theinformation.com/articles/microsoft-and-openai-wrangle-over-terms-of-their-blockbuster-partnership\">The Information reported the details</a> (summarized here outside of their paywall <a href=\"https://techcrunch.com/2024/12/26/microsoft-and-openai-have-a-financial-definition-of-agi-report/\">by TechCrunch</a>):</p> <blockquote> <p>Last year’s agreement between Microsoft and OpenAI, which hasn’t been disclosed, said AGI would be achieved only when OpenAI has developed systems that have the ability to generate the maximum total profits to which its earliest investors, including Microsoft, are entitled, according to documents OpenAI distributed to investors. Those profits total about $100 billion, the documents showed.</p> </blockquote> <p>So AGI is now whenever OpenAI's systems are capable of generating $100 billion in profit?</p> <p>In October 2025 the process changed to being judged by an \"independent expert panel\". 
In <a href=\"https://openai.com/index/next-chapter-of-microsoft-openai-partnership/\">The next chapter of the Microsoft–OpenAI partnership</a>:</p> <blockquote> <p>The agreement preserves key elements that have fueled this successful partnership—meaning OpenAI remains Microsoft’s frontier model partner and Microsoft continues to have exclusive IP rights and Azure API exclusivity until Artificial General Intelligence (AGI). [...]</p> <p>Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel. [...]</p> <p>Microsoft’s IP rights to research, defined as the confidential methods used in the development of models and systems, will remain until either the expert panel verifies AGI or through 2030, whichever is first.</p> </blockquote> <p>OpenAI on February 27th, 2026 in <a href=\"https://openai.com/index/continuing-microsoft-partnership/\">Joint Statement from OpenAI and Microsoft</a>:</p> <blockquote> <p><strong>AGI definition and processes are unchanged</strong>. The contractual definition of AGI and the process for determining if it has been achieved remains the same.</p> </blockquote> <p>OpenAI today, April 27th 2026 in <a href=\"https://openai.com/index/next-phase-of-microsoft-partnership/\">The next phase of the Microsoft OpenAI partnership</a> (emphasis mine):</p> <blockquote> <ul> <li>Microsoft will continue to have a license to OpenAI IP for models and products through 2032. Microsoft’s license will now be non-exclusive.</li> <li>Microsoft will no longer pay a revenue share to OpenAI.</li> <li>Revenue share payments from OpenAI to Microsoft continue through 2030, <strong>independent of OpenAI’s technology progress</strong>, at the same percentage but subject to a total cap.</li> </ul> </blockquote> <p>As far as I can tell \"independent of OpenAI’s technology progress\" is a declaration that the AGI clause is now dead. 
Here's The Verge coming to the same conclusion: <a href=\"https://www.theverge.com/ai-artificial-intelligence/918981/openai-microsoft-renegotiate-contract\">The AGI clause is dead</a>.</p> <p>My all-time favorite commentary on OpenAI's approach to AGI remains this 2023 hypothetical <a href=\"https://www.bloomberg.com/opinion/articles/2023-11-20/who-controls-openai\">by Matt Levine</a>:</p> <blockquote> <p>And the investors wailed and gnashed their teeth but it’s true, that is what they agreed to, and they had no legal recourse. And OpenAI’s new CEO, and its nonprofit board, cut them a check for their capped return and said “bye” and went back to running OpenAI for the benefit of humanity. It turned out that a benign, carefully governed artificial superintelligence is really good for humanity, and OpenAI quickly solved all of humanity’s problems and ushered in an age of peace and abundance in which nobody wanted for anything or needed any Microsoft products. And capitalism came to an end.</p> </blockquote> <p>Tags: <a href=\"https://simonwillison.net/tags/computer-history\">computer-history</a>, <a href=\"https://simonwillison.net/tags/microsoft\">microsoft</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8dd3754441"
    },
    {
      "id": "openai|https://openai.com/index/openai-available-at-fedramp-moderate",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "OpenAI available at FedRAMP Moderate",
      "headline": "OpenAI available at FedRAMP Moderate",
      "url": "https://openai.com/index/openai-available-at-fedramp-moderate",
      "published_at": "2026-04-27T14:00:00Z",
      "description": "OpenAI is available at FedRAMP Moderate authorization for ChatGPT Enterprise and the OpenAI API, enabling secure AI adoption for U.S. federal agencies.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "98135e4ab0"
    },
    {
      "id": "huggingface|https://huggingface.co/XiaomiMiMo/MiMo-V2.5",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "XiaomiMiMo/MiMo-V2.5 · 186♥ · 28323 dl",
      "headline": "Xiaomi's MiMo-V2.5 model on HuggingFace reached 28,323 downloads and 186 likes.",
      "url": "https://huggingface.co/XiaomiMiMo/MiMo-V2.5",
      "published_at": "2026-04-27T13:37:38Z",
      "description": "",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:27:18Z",
      "anchor": "389fa46d6e"
    },
    {
      "id": "huggingface|https://huggingface.co/XiaomiMiMo/MiMo-V2.5-Pro",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "XiaomiMiMo/MiMo-V2.5-Pro · 361♥ · 9914 dl",
      "headline": "Xiaomi's MiMo-V2.5-Pro text-generation model is available on HuggingFace with 9,914 downloads.",
      "url": "https://huggingface.co/XiaomiMiMo/MiMo-V2.5-Pro",
      "published_at": "2026-04-27T12:52:53Z",
      "description": "text-generation",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:27:30Z",
      "anchor": "51cc1d49bf"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/announcing-our-partnership-with-the-republic-of-korea",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Announcing our partnership with the Republic of Korea",
      "headline": "Announcing our partnership with the Republic of Korea",
      "url": "https://deepmind.google/blog/announcing-our-partnership-with-the-republic-of-korea",
      "published_at": "2026-04-27T07:00:06Z",
      "description": "Google DeepMind and Korea partner to accelerate scientific breakthroughs using frontier AI models",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "04f4f70b5c"
    },
    {
      "id": "openai|https://openai.com/index/next-phase-of-microsoft-partnership",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "The next phase of the Microsoft OpenAI partnership",
      "headline": "The next phase of the Microsoft OpenAI partnership",
      "url": "https://openai.com/index/next-phase-of-microsoft-partnership",
      "published_at": "2026-04-27T06:00:00Z",
      "description": "OpenAI and Microsoft announce an amended agreement that simplifies the partnership, adds long-term clarity, and supports continued AI innovation at scale.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "4344f5cc2d"
    },
    {
      "id": "openai|https://openai.com/index/open-source-codex-orchestration-symphony",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "An open-source spec for orchestration: Symphony",
      "headline": "An open-source spec for orchestration: Symphony",
      "url": "https://openai.com/index/open-source-codex-orchestration-symphony",
      "published_at": "2026-04-27T00:00:00Z",
      "description": "Learn how Symphony, an open-source spec for Codex orchestration, turns issue trackers into always-on agent systems—boosting engineering output and reducing context switching.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "e1044c2b91"
    },
    {
      "id": "openai|https://openai.com/index/choco",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Choco automates food distribution with AI agents",
      "headline": "Choco automates food distribution with AI agents",
      "url": "https://openai.com/index/choco",
      "published_at": "2026-04-27T00:00:00Z",
      "description": "How Choco used OpenAI APIs to streamline food distribution, boost productivity, and unlock growth—an in-depth customer story on real-world AI impact.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "56bfd416fb"
    },
    {
      "id": "openai|https://openai.com/index/our-principles",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Our principles",
      "headline": "Our principles",
      "url": "https://openai.com/index/our-principles",
      "published_at": "2026-04-26T16:00:00Z",
      "description": "Our mission is to ensure that AGI benefits all of humanity. Sam Altman shares five principles that guide our work.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2894b56032"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/25/romain-huet",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "Quoting Romain Huet",
      "headline": "Quoting Romain Huet",
      "url": "https://simonwillison.net/2026/Apr/25/romain-huet",
      "published_at": "2026-04-25T12:06:55Z",
      "description": "<blockquote cite=\"https://twitter.com/romainhuet/status/2047955381578838357\"><p>Since GPT-5.4, we’ve unified Codex and the main model into a single system, so there’s no separate coding line anymore.</p> <p>GPT-5.5 takes this further, with strong gains in agentic coding, computer use, and any task on a computer.</p></blockquote> <p class=\"cite\">&mdash; <a href=\"https://twitter.com/romainhuet/status/2047955381578838357\">Romain Huet</a>, confirming OpenAI won't release a GPT-5.5-Codex model</p> <p>Tags: <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/gpt\">gpt</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5f0bc863ef"
    },
    {
      "id": "huggingface|https://huggingface.co/SeeSee21/Z-Anime",
      "anchor": "81e30bd5d5",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "SeeSee21/Z-Anime · 117♥ · 1647 dl",
      "headline": "SeeSee21/Z-Anime · 117♥ · 1647 dl",
      "url": "https://huggingface.co/SeeSee21/Z-Anime",
      "published_at": "2026-04-25T11:37:04Z",
      "discovered_at": "2026-05-04T04:18:19Z",
      "description": "text-to-image",
      "summarize": true,
      "summarized": false,
      "summarized_at": null
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-deepseek-v4-pro-16t-a49b-and",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips",
      "headline": "[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips",
      "url": "https://www.latent.space/p/ainews-deepseek-v4-pro-16t-a49b-and",
      "published_at": "2026-04-25T05:00:48Z",
      "description": "The prodigal Tiger returns... but is no longer the benchmarks leader.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a598ae61dc"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/25/gpt-5-5-prompting-guide",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "GPT-5.5 prompting guide",
      "headline": "GPT-5.5 prompting guide",
      "url": "https://simonwillison.net/2026/Apr/25/gpt-5-5-prompting-guide",
      "published_at": "2026-04-25T04:13:36Z",
      "description": "<p><strong><a href=\"https://developers.openai.com/api/docs/guides/prompt-guidance?model=gpt-5.5\">GPT-5.5 prompting guide</a></strong></p> <p>Now that GPT-5.5 is <a href=\"https://developers.openai.com/api/docs/models/gpt-5.5\">available in the API</a>, OpenAI have released a wealth of useful tips on how best to prompt the new model.</p> <p>Here's a neat trick they recommend for applications that might spend considerable time thinking before returning a user-visible response:</p> <blockquote> <p><code>Before any tool calls for a multi-step task, send a short user-visible update that acknowledges the request and states the first step. Keep it to one or two sentences.</code></p> </blockquote> <p>I've already noticed their Codex app doing this, and it does make longer running tasks feel less like the model has crashed.</p> <p>OpenAI suggest running the following in Codex to upgrade your existing code using advice embedded in their <code>openai-docs</code> skill:</p> <blockquote> <p><code>$openai-docs migrate this project to gpt-5.5</code></p> </blockquote> <p>The upgrade guide the coding agent will follow <a href=\"https://github.com/openai/skills/blob/724cd511c96593f642bddf13187217aa155d2554/skills/.curated/openai-docs/references/upgrade-guide.md#model-string--light-prompt-rewrite\">is this one</a>, which even includes light instructions on how to rewrite prompts to better fit the model.</p> <p>Also relevant is the <a href=\"https://developers.openai.com/api/docs/guides/latest-model\">Using GPT-5.5 guide</a>, which opens with this warning:</p> <blockquote> <p>To get the most out of GPT-5.5, treat it as a new model family to tune for, not a drop-in replacement for <code>gpt-5.2</code> or <code>gpt-5.4</code>. Begin migration with a fresh baseline instead of carrying over every instruction from an older prompt stack. 
Start with the smallest prompt that preserves the product contract, then tune reasoning effort, verbosity, tool descriptions, and output format against representative examples.</p> </blockquote> <p>Interesting to see OpenAI recommend starting from scratch rather than trusting that existing prompts optimized for previous models will continue to work effectively with GPT-5.5.</p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/prompt-engineering\">prompt-engineering</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/gpt\">gpt</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "20ac9c8df0"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/24/llm",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "llm 0.31",
      "headline": "llm 0.31",
      "url": "https://simonwillison.net/2026/Apr/24/llm",
      "published_at": "2026-04-24T23:35:07Z",
      "description": "<p><strong>Release:</strong> <a href=\"https://github.com/simonw/llm/releases/tag/0.31\">llm 0.31</a></p> <blockquote> <ul> <li>New GPT-5.5 OpenAI model: <code>llm -m gpt-5.5</code>. <a href=\"https://github.com/simonw/llm/issues/1418\">#1418</a></li> <li>New option to set the <a href=\"https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_new_params_and_tools#1-verbosity-parameter\">text verbosity level</a> for GPT-5+ OpenAI models: <code>-o verbosity low</code>. Values are <code>low</code>, <code>medium</code>, <code>high</code>.</li> <li>New option for setting the <a href=\"https://developers.openai.com/api/docs/guides/images-vision#choose-an-image-detail-level\">image detail level</a> used for image attachments to OpenAI models: <code>-o image_detail low</code> - values are <code>low</code>, <code>high</code> and <code>auto</code>, and GPT-5.4 and 5.5 also accept <code>original</code>.</li> <li>Models listed in <code>extra-openai-models.yaml</code> are now also registered as asynchronous. <a href=\"https://github.com/simonw/llm/issues/1395\">#1395</a></li> </ul> </blockquote> <p>Tags: <a href=\"https://simonwillison.net/tags/gpt\">gpt</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/llm\">llm</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "25aaaf51e4"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/anthropic-amazon-compute",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Anthropic and Amazon expand collaboration for up to 5 gigawatts of new compute",
      "headline": "Anthropic and Amazon expand collaboration for up to 5 gigawatts of new compute",
      "url": "https://www.anthropic.com/news/anthropic-amazon-compute",
      "published_at": "2026-04-24T15:50:13Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "73556380c5"
    },
    {
      "id": "anthropic|https://www.anthropic.com/engineering/infrastructure-noise",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Quantifying infrastructure noise in agentic coding evals",
      "headline": "Quantifying infrastructure noise in agentic coding evals",
      "url": "https://www.anthropic.com/engineering/infrastructure-noise",
      "published_at": "2026-04-24T14:21:28Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ccfb91b274"
    },
    {
      "id": "anthropic|https://www.anthropic.com/engineering/managed-agents",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Scaling Managed Agents: Decoupling the brain from the hands",
      "headline": "Scaling Managed Agents: Decoupling the brain from the hands",
      "url": "https://www.anthropic.com/engineering/managed-agents",
      "published_at": "2026-04-24T14:19:43Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "aae5c7e03e"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/24/deepseek-v4",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "DeepSeek V4 - almost on the frontier, a fraction of the price",
      "headline": "DeepSeek V4 - almost on the frontier, a fraction of the price",
      "url": "https://simonwillison.net/2026/Apr/24/deepseek-v4",
      "published_at": "2026-04-24T06:01:04Z",
      "description": "<p>Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) <a href=\"https://simonwillison.net/2025/Dec/1/deepseek-v32/\">last December</a>. They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, <a href=\"https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro\">DeepSeek-V4-Pro</a> and <a href=\"https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash\">DeepSeek-V4-Flash</a>.</p> <p>Both models are 1 million token context Mixture of Experts. Pro is 1.6T total parameters, 49B active. Flash is 284B total, 13B active. They're using the standard MIT license.</p> <p>I think this makes DeepSeek-V4-Pro the new largest open weights model. It's larger than Kimi K2.6 (1.1T) and GLM-5.1 (754B) and more than twice the size of DeepSeek V3.2 (685B).</p> <p>Pro is 865GB on Hugging Face, Flash is 160GB. I'm hoping that a lightly quantized Flash will run on my 128GB M5 MacBook Pro. It's <em>possible</em> the Pro model may run on it if I can stream just the necessary active experts from disk.</p> <p>For the moment I tried the models out via <a href=\"https://openrouter.ai/\">OpenRouter</a>, using <a href=\"https://github.com/simonw/llm-openrouter\">llm-openrouter</a>:</p> <pre><code>llm install llm-openrouter llm openrouter refresh llm -m openrouter/deepseek/deepseek-v4-pro 'Generate an SVG of a pelican riding a bicycle' </code></pre> <p>Here's the pelican <a href=\"https://gist.github.com/simonw/4a7a9e75b666a58a0cf81495acddf529\">for DeepSeek-V4-Flash</a>:</p> <p><img src=\"https://static.simonwillison.net/static/2026/deepseek-v4-flash.png\" alt=\"Excellent bicycle - good frame shape, nice chain, even has a reflector on the front wheel. Pelican has a mean looking expression but has its wings on the handlebars and feet on the pedals. 
Pouch is a little sharp.\" style=\"max-width: 100%;\" /></p> <p>And <a href=\"https://gist.github.com/simonw/9e8dfed68933ab752c9cf27a03250a7c\">for DeepSeek-V4-Pro</a>:</p> <p><img src=\"https://static.simonwillison.net/static/2026/deepseek-v4-pro.png\" alt=\"Another solid bicycle, albeit the spokes are a little jagged and the frame is compressed a bit. Pelican has gone a bit wrong - it has a VERY large body, only one wing, a weirdly hairy backside and generally looks like it was drawn by a different artist from the bicycle.\" style=\"max-width: 100%;\" /></p> <p>For comparison, take a look at the pelicans I got from <a href=\"https://simonwillison.net/2025/Dec/1/deepseek-v32/\">DeepSeek V3.2 in December</a>, <a href=\"https://simonwillison.net/2025/Aug/22/deepseek-31/\">V3.1 in August</a>, and <a href=\"https://simonwillison.net/2025/Mar/24/deepseek/\">V3-0324 in March 2025</a>.</p> <p>So the pelicans are pretty good, but what's really notable here is the <em>cost</em>. DeepSeek V4 is a very, very inexpensive model.</p> <p>This is <a href=\"https://api-docs.deepseek.com/quick_start/pricing\">DeepSeek's pricing page</a>. 
They're charging $0.14/million tokens input and $0.28/million tokens output for Flash, and $1.74/million input and $3.48/million output for Pro.</p> <p>Here's a comparison table with the frontier models from Gemini, OpenAI and Anthropic:</p> <center> <table> <thead> <tr> <th>Model</th> <th>Input ($/M)</th> <th>Output ($/M)</th> </tr> </thead> <tbody> <tr> <td><strong>DeepSeek V4 Flash</strong></td> <td>$0.14</td> <td>$0.28</td> </tr> <tr> <td>GPT-5.4 Nano</td> <td>$0.20</td> <td>$1.25</td> </tr> <tr> <td>Gemini 3.1 Flash-Lite</td> <td>$0.25</td> <td>$1.50</td> </tr> <tr> <td>Gemini 3 Flash Preview</td> <td>$0.50</td> <td>$3</td> </tr> <tr> <td>GPT-5.4 Mini</td> <td>$0.75</td> <td>$4.50</td> </tr> <tr> <td>Claude Haiku 4.5</td> <td>$1</td> <td>$5</td> </tr> <tr> <td><strong>DeepSeek V4 Pro</strong></td> <td>$1.74</td> <td>$3.48</td> </tr> <tr> <td>Gemini 3.1 Pro</td> <td>$2</td> <td>$12</td> </tr> <tr> <td>GPT-5.4</td> <td>$2.50</td> <td>$15</td> </tr> <tr> <td>Claude Sonnet 4.6</td> <td>$3</td> <td>$15</td> </tr> <tr> <td>Claude Opus 4.7</td> <td>$5</td> <td>$25</td> </tr> <tr> <td>GPT-5.5</td> <td>$5</td> <td>$30</td> </tr> </tbody> </table> </center> <p>DeepSeek-V4-Flash is the cheapest of the small models, beating even OpenAI's GPT-5.4 Nano. DeepSeek-V4-Pro is the cheapest of the larger frontier models.</p> <p>This note from <a href=\"https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash/blob/main/DeepSeek_V4.pdf\">the DeepSeek paper</a> helps explain why they can price these models so low - they've focused a great deal on efficiency with this release, especially for longer context prompts:</p> <blockquote> <p>In the scenario of 1M-token context, even DeepSeek-V4-Pro, which has a larger number of activated parameters, attains only 27% of the single-token FLOPs (measured in equivalent FP8 FLOPs) and 10% of the KV cache size relative to DeepSeek-V3.2. 
Furthermore, DeepSeek-V4-Flash, with its smaller number of activated parameters, pushes efficiency even further: in the 1M-token context setting, it achieves only 10% of the single-token FLOPs and 7% of the KV cache size compared with DeepSeek-V3.2.</p> </blockquote> <p>DeepSeek's self-reported benchmarks <a href=\"https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash/blob/main/DeepSeek_V4.pdf\">in their paper</a> show their Pro model competitive with those other frontier models, albeit with this note:</p> <blockquote> <p>Through the expansion of reasoning tokens, DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks. Nevertheless, its performance falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months.</p> </blockquote> <p>I'm keeping an eye on <a href=\"https://huggingface.co/unsloth/models\">huggingface.co/unsloth/models</a> as I expect the Unsloth team will have a set of quantized versions out pretty soon. It's going to be very interesting to see how well that Flash model runs on my own machine.</p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm\">llm</a>, <a href=\"https://simonwillison.net/tags/llm-pricing\">llm-pricing</a>, <a href=\"https://simonwillison.net/tags/pelican-riding-a-bicycle\">pelican-riding-a-bicycle</a>, <a href=\"https://simonwillison.net/tags/deepseek\">deepseek</a>, <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a>, <a href=\"https://simonwillison.net/tags/openrouter\">openrouter</a>, <a href=\"https://simonwillison.net/tags/ai-in-china\">ai-in-china</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "87fb3bbba9"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-gpt-55-and-openai-codex-superapp",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] GPT 5.5 and OpenAI Codex Superapp",
      "headline": "[AINews] GPT 5.5 and OpenAI Codex Superapp",
      "url": "https://www.latent.space/p/ainews-gpt-55-and-openai-codex-superapp",
      "published_at": "2026-04-24T04:40:43Z",
      "description": "Spud lives!",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "7fb0ed753a"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/anthropic-nec",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Anthropic and NEC partner to build AI-native engineering at scale in Japan",
      "headline": "Anthropic and NEC partner to build AI-native engineering at scale in Japan",
      "url": "https://www.anthropic.com/news/anthropic-nec",
      "published_at": "2026-04-24T04:06:50Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "da026c8c77"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/24/recent-claude-code-quality-reports",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "An update on recent Claude Code quality reports",
      "headline": "An update on recent Claude Code quality reports",
      "url": "https://simonwillison.net/2026/Apr/24/recent-claude-code-quality-reports",
      "published_at": "2026-04-24T01:31:25Z",
      "description": "<p><strong><a href=\"https://www.anthropic.com/engineering/april-23-postmortem\">An update on recent Claude Code quality reports</a></strong></p> <p>It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems.</p> <p>The models themselves were not to blame, but three separate issues in the Claude Code harness caused complex but material problems which directly affected users.</p> <p>Anthropic's postmortem describes these in detail. This one in particular stood out to me:</p> <blockquote> <p>On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive.</p> </blockquote> <p>I <em>frequently</em> have Claude Code sessions which I leave for an hour (or often a day or longer) before returning to them. Right now I have 11 of those (according to <code>ps aux | grep 'claude '</code>) and that's after closing down dozens more the other day.</p> <p>I estimate I spend more time prompting in these \"stale\" sessions than sessions that I've recently started!</p> <p>If you're building agentic systems it's worth reading this article in detail - the kinds of bugs that affect harnesses are deeply complicated, even if you put aside the inherent non-deterministic nature of the models themselves.</p> 
<p><small>Via <a href=\"https://news.ycombinator.com/item?id=47878905\">Hacker News</a></small></p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/prompt-engineering\">prompt-engineering</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/anthropic\">anthropic</a>, <a href=\"https://simonwillison.net/tags/coding-agents\">coding-agents</a>, <a href=\"https://simonwillison.net/tags/claude-code\">claude-code</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "99c086129b"
    },
    {
      "id": "huggingface|https://huggingface.co/poolside/Laguna-XS.2",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "poolside/Laguna-XS.2 · 186♥ · 7573 dl",
      "headline": "Laguna-XS.2 is a text-generation model on HuggingFace by poolside with 186 likes and 7,573 downloads.",
      "url": "https://huggingface.co/poolside/Laguna-XS.2",
      "published_at": "2026-04-23T20:50:45Z",
      "description": "text-generation",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:27:37Z",
      "anchor": "b9127c2036"
    },
    {
      "id": "simonw|https://simonwillison.net/2026/Apr/23/gpt-5-5",
      "source_id": "simonw",
      "source_name": "Simon Willison",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "simonw"
      ],
      "title": "A pelican for GPT-5.5 via the semi-official Codex backdoor API",
      "headline": "A pelican for GPT-5.5 via the semi-official Codex backdoor API",
      "url": "https://simonwillison.net/2026/Apr/23/gpt-5-5",
      "published_at": "2026-04-23T19:59:47Z",
      "description": "<p><a href=\"https://openai.com/index/introducing-gpt-5-5/\">GPT-5.5 is out</a>. It's available in OpenAI Codex and is rolling out to paid ChatGPT subscribers. I've had some preview access and found it to be a fast, effective and highly capable model. As is usually the case these days, it's hard to put into words what's good about it - I ask it to build things and it builds exactly what I ask for!</p> <p>There's one notable omission from today's release - the API:</p> <blockquote> <p>API deployments require different safeguards and we are working closely with partners and customers on the safety and security requirements for serving it at scale. We'll bring GPT‑5.5 and GPT‑5.5 Pro to the API very soon.</p> </blockquote> <p>When I run my <a href=\"https://simonwillison.net/tags/pelican-riding-a-bicycle/\">pelican benchmark</a> I always prefer to use an API, to avoid hidden system prompts in ChatGPT or other agent harnesses from impacting the results.</p> <h4 id=\"the-openclaw-backdoor\">The OpenClaw backdoor</h4> <p>One of the ongoing tension points in the AI world over the past few months has concerned how agent harnesses like OpenClaw and Pi interact with the APIs provided by the big providers.</p> <p>Both OpenAI and Anthropic offer popular monthly subscriptions which provide access to their models at a significant discount to their raw API.</p> <p>OpenClaw integrated directly with this mechanism, and was then <a href=\"https://www.theverge.com/ai-artificial-intelligence/907074/anthropic-openclaw-claude-subscription-ban\">blocked from doing so</a> by Anthropic. This kicked off a whole thing. 
OpenAI - who recently hired OpenClaw creator Peter Steinberger - saw an opportunity for an easy karma win and announced that OpenClaw was welcome to continue integrating with OpenAI's subscriptions via the same mechanism used by their (open source) Codex CLI tool.</p> <p>Does this mean <em>anyone</em> can write code that integrates with OpenAI's Codex-specific APIs to hook into those existing subscriptions?</p> <p>The other day <a href=\"https://twitter.com/jeremyphoward/status/2046537816834965714\">Jeremy Howard asked</a>:</p> <blockquote> <p>Anyone know whether OpenAI officially supports the use of the <code>/backend-api/codex/responses</code> endpoint that Pi and Opencode (IIUC) uses?</p> </blockquote> <p>It turned out that on March 30th OpenAI's Romain Huet <a href=\"https://twitter.com/romainhuet/status/2038699202834841962\">had tweeted</a>:</p> <blockquote> <p>We want people to be able to use Codex, and their ChatGPT subscription, wherever they like! That means in the app, in the terminal, but also in JetBrains, Xcode, OpenCode, Pi, and now Claude Code.</p> <p>That’s why Codex CLI and Codex app server are open source too! 🙂</p> </blockquote> <p>And Peter Steinberger <a href=\"https://twitter.com/steipete/status/2046775849769148838\">replied to Jeremy</a> that:</p> <blockquote> <p>OpenAI sub is officially supported.</p> </blockquote> <h4 id=\"llm-openai-via-codex\">llm-openai-via-codex</h4> <p>So... I had Claude Code reverse-engineer the <a href=\"https://github.com/openai/codex\">openai/codex</a> repo, figure out how authentication tokens were stored and build me <a href=\"https://github.com/simonw/llm-openai-via-codex\">llm-openai-via-codex</a>, a new plugin for <a href=\"https://llm.datasette.io/\">LLM</a> which picks up your existing Codex subscription and uses it to run prompts!</p> <p>(With hindsight I wish I'd used GPT-5.4 or the GPT-5.5 preview, it would have been funnier. 
I genuinely considered rewriting the project from scratch using Codex and GPT-5.5 for the sake of the joke, but decided not to spend any more time on this!)</p> <p>Here's how to use it:</p> <ol> <li>Install Codex CLI, buy an OpenAI plan, login to Codex</li> <li>Install LLM: <code>uv tool install llm</code> </li> <li>Install the new plugin: <code>llm install llm-openai-via-codex</code> </li> <li>Start prompting: <code>llm -m openai-codex/gpt-5.5 'Your prompt goes here'</code> </li> </ol> <p>All existing LLM features should also work - use <code>-a filepath.jpg/URL</code> to attach an image, <code>llm chat -m openai-codex/gpt-5.5</code> to start an ongoing chat, <code>llm logs</code> to view logged conversations and <code>llm --tool ...</code> to <a href=\"https://llm.datasette.io/en/stable/tools.html\">try it out with tool support</a>.</p> <h4 id=\"and-some-pelicans\">And some pelicans</h4> <p>Let's generate a pelican!</p> <div class=\"highlight highlight-source-shell\"><pre>llm install llm-openai-via-codex llm -m openai-codex/gpt-5.5 <span class=\"pl-s\"><span class=\"pl-pds\">'</span>Generate an SVG of a pelican riding a bicycle<span class=\"pl-pds\">'</span></span></pre></div> <p>Here's <a href=\"https://gist.github.com/simonw/edda1d98f7ba07fd95eeff473cb16634\">what I got back</a>:</p> <p><img src=\"https://static.simonwillison.net/static/2026/gpt-5.5-pelican.png\" alt=\"It is a bit mangled to be honest - good beak, pelican body shapes are slightly weird, legs do at least extend to the pedals, bicycle frame is not quite right.\" style=\"max-width: 100%;\" /></p> <p>I've seen better <a href=\"https://simonwillison.net/2026/Mar/17/mini-and-nano/#pelicans\">from GPT-5.4</a>, so I tagged on <code>-o reasoning_effort xhigh</code> and <a href=\"https://gist.github.com/simonw/a6168e4165a258e4d664aeae8e602cc5\">tried again</a>:</p> <p>That one took almost four minutes to generate, but I think it's a much better effort.</p> <p><img 
src=\"https://static.simonwillison.net/static/2026/gpt-5.5-pelican-xhigh.png\" alt=\"Pelican has gradients now, body is much better put together, bicycle is nearly the right shape albeit with one extra bar between pedals and front wheel, clearly a better image overall.\" style=\"max-width: 100%;\" /></p> <p>If you compare the SVG code (<a href=\"https://gist.github.com/simonw/edda1d98f7ba07fd95eeff473cb16634#response\">default</a>, <a href=\"https://gist.github.com/simonw/a6168e4165a258e4d664aeae8e602cc5#response\">xhigh</a>) the <code>xhigh</code> one took a very different approach, which is much more CSS-heavy - as demonstrated by those gradients. <code>xhigh</code> used 9,322 reasoning tokens where the default used just 39.</p> <h4 id=\"a-few-more-notes-on-gpt-5-5\">A few more notes on GPT-5.5</h4> <p>One of the most notable things about GPT-5.5 is the pricing. Once it goes live in the API it's <a href=\"https://openai.com/index/introducing-gpt-5-5/#availability-and-pricing\">going to be priced</a> at <em>twice</em> the cost of GPT-5.4 - $5 per 1M input tokens and $30 per 1M output tokens, where 5.4 is $2.5 and $15.</p> <p>GPT-5.5 Pro will be even more: $30 per 1M input tokens and $180 per 1M output tokens.</p> <p>GPT-5.4 will remain available. At half the price of 5.5 this feels like 5.4 is to 5.5 as Claude Sonnet is to Claude Opus.</p> <p>Ethan Mollick has a <a href=\"https://www.oneusefulthing.org/p/sign-of-the-future-gpt-55\">detailed review of GPT-5.5</a> where he put it (and GPT-5.5 Pro) through an array of interesting challenges. 
His verdict: the jagged frontier continues to hold, with GPT-5.5 excellent at some things and challenged by others in a way that remains difficult to predict.</p> <p>Tags: <a href=\"https://simonwillison.net/tags/ai\">ai</a>, <a href=\"https://simonwillison.net/tags/openai\">openai</a>, <a href=\"https://simonwillison.net/tags/generative-ai\">generative-ai</a>, <a href=\"https://simonwillison.net/tags/chatgpt\">chatgpt</a>, <a href=\"https://simonwillison.net/tags/llms\">llms</a>, <a href=\"https://simonwillison.net/tags/llm\">llm</a>, <a href=\"https://simonwillison.net/tags/llm-pricing\">llm-pricing</a>, <a href=\"https://simonwillison.net/tags/pelican-riding-a-bicycle\">pelican-riding-a-bicycle</a>, <a href=\"https://simonwillison.net/tags/llm-reasoning\">llm-reasoning</a>, <a href=\"https://simonwillison.net/tags/llm-release\">llm-release</a>, <a href=\"https://simonwillison.net/tags/codex-cli\">codex-cli</a>, <a href=\"https://simonwillison.net/tags/gpt\">gpt</a></p>",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6ecec9bab8"
    },
    {
      "id": "latentspace|https://www.latent.space/p/unsupervised-learning-2026",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "AIE Europe Debrief + Agent Labs Thesis: Unsupervised Learning x Latent Space Crossover Special (2026)",
      "headline": "AIE Europe Debrief + Agent Labs Thesis: Unsupervised Learning x Latent Space Crossover Special (2026)",
      "url": "https://www.latent.space/p/unsupervised-learning-2026",
      "published_at": "2026-04-23T19:37:19Z",
      "description": "Note: This episode was recorded just after AIE Europe, but before the Cursor-xAI deal.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "fe7425e042"
    },
    {
      "id": "openai|https://openai.com/index/introducing-gpt-5-5",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Introducing GPT-5.5",
      "headline": "Introducing GPT-5.5",
      "url": "https://openai.com/index/introducing-gpt-5-5",
      "published_at": "2026-04-23T11:00:00Z",
      "description": "Introducing GPT-5.5, our smartest model yet—faster, more capable, and built for complex tasks like coding, research, and data analysis across tools.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "835f8ca923"
    },
    {
      "id": "openai|https://openai.com/index/gpt-5-5-system-card",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "GPT-5.5 System Card",
      "headline": "GPT-5.5 System Card",
      "url": "https://openai.com/index/gpt-5-5-system-card",
      "published_at": "2026-04-23T11:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b065e0ec9e"
    },
    {
      "id": "huggingface|https://huggingface.co/z-lab/Qwen3.6-27B-DFlash",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "z-lab/Qwen3.6-27B-DFlash · 203♥ · 17016 dl",
      "headline": "HuggingFace model Qwen3.6-27B-DFlash for text generation reached 17,016 downloads.",
      "url": "https://huggingface.co/z-lab/Qwen3.6-27B-DFlash",
      "published_at": "2026-04-23T10:15:06Z",
      "description": "text-generation",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:28:23Z",
      "anchor": "2326d3b9f6"
    },
    {
      "id": "openai|https://openai.com/academy/what-is-codex",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "What is Codex?",
      "headline": "What is Codex?",
      "url": "https://openai.com/academy/what-is-codex",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Learn how Codex helps you go beyond chat by automating tasks, connecting tools, and producing real outputs like docs and dashboards.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a75ec8f16e"
    },
    {
      "id": "openai|https://openai.com/academy/codex-how-to-start",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "How to get started with Codex",
      "headline": "How to get started with Codex",
      "url": "https://openai.com/academy/codex-how-to-start",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Learn how to get started with Codex by setting up projects, creating threads, and completing your first tasks with step-by-step guidance.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ec74c01a93"
    },
    {
      "id": "openai|https://openai.com/academy/codex-settings",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Codex settings",
      "headline": "Codex settings",
      "url": "https://openai.com/academy/codex-settings",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Learn how to configure Codex settings, including personalization, detail level, and permissions, to run tasks smoothly and customize your workflow.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "1ceca72f36"
    },
    {
      "id": "openai|https://openai.com/academy/working-with-codex",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Working with Codex",
      "headline": "Working with Codex",
      "url": "https://openai.com/academy/working-with-codex",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Learn how to set up your Codex workspace, create threads and projects, manage files, and start completing tasks with step-by-step guidance.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "7aede4bc19"
    },
    {
      "id": "openai|https://openai.com/academy/codex-plugins-and-skills",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Plugins and skills",
      "headline": "Plugins and skills",
      "url": "https://openai.com/academy/codex-plugins-and-skills",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Learn how to use Codex plugins and skills to connect tools, access data, and follow repeatable workflows to automate tasks and improve results.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "aca0f3beb3"
    },
    {
      "id": "openai|https://openai.com/academy/top-10-use-cases-codex-for-work",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Top 10 uses for Codex at work",
      "headline": "Top 10 uses for Codex at work",
      "url": "https://openai.com/academy/top-10-use-cases-codex-for-work",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Explore 10 practical Codex use cases to automate tasks, create deliverables, and turn real inputs into outputs across tools, files, and workflows.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "218e20a528"
    },
    {
      "id": "openai|https://openai.com/academy/codex-automations",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Automations",
      "headline": "Automations",
      "url": "https://openai.com/academy/codex-automations",
      "published_at": "2026-04-23T10:00:00Z",
      "description": "Learn how to automate tasks in Codex using schedules and triggers to create reports, summaries, and recurring workflows without manual effort.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "90bae5f547"
    },
    {
      "id": "mistral|https://mistral.ai/news/debugging-memory-leak-in-vllm",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Heaps do lie: debugging a memory leak in vLLM.",
      "headline": "Heaps do lie: debugging a memory leak in vLLM.",
      "url": "https://mistral.ai/news/debugging-memory-leak-in-vllm",
      "published_at": "2026-04-23T09:54:51Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "58900811a8"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-tasteful-tokenmaxxing",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] Tasteful Tokenmaxxing",
      "headline": "[AINews] Tasteful Tokenmaxxing",
      "url": "https://www.latent.space/p/ainews-tasteful-tokenmaxxing",
      "published_at": "2026-04-23T02:45:37Z",
      "description": "a quiet day lets us reflect on the top conversation that AI leaders are having everywhere.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "94bf67265f"
    },
    {
      "id": "openai|https://openai.com/index/gpt-5-5-bio-bug-bounty",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "GPT-5.5 Bio Bug Bounty",
      "headline": "GPT-5.5 Bio Bug Bounty",
      "url": "https://openai.com/index/gpt-5-5-bio-bug-bounty",
      "published_at": "2026-04-23T00:00:00Z",
      "description": "Explore the GPT-5.5 Bio Bug Bounty: a red-teaming challenge to find universal jailbreaks for bio safety risks, with rewards up to $25,000.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6553129a1a"
    },
    {
      "id": "xai|https://x.ai/news/grok-voice-think-fast-1",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Voice Think Fast 1.0",
      "headline": "Grok Voice Think Fast 1.0",
      "url": "https://x.ai/news/grok-voice-think-fast-1",
      "published_at": "2026-04-23T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8bf072c7ec"
    },
    {
      "id": "latentspace|https://www.latent.space/p/shopify",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "Shopify’s AI Phase Transition: 2026 Usage Explosion, Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym — with Mikhail Parakhin, Shopify CTO",
      "headline": "Shopify’s AI Phase Transition: 2026 Usage Explosion, Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym — with Mikhail Parakhin, Shopify CTO",
      "url": "https://www.latent.space/p/shopify",
      "published_at": "2026-04-22T19:33:49Z",
      "description": "A rare interview with Shopify's CTO on -everything- that Shopify is doing to maximize AI for their customers, with exclusive data on their own AI adoption.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "67d6c050ca"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/81k-economics",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "What 81,000 people told us about the economics of AI",
      "headline": "What 81,000 people told us about the economics of AI",
      "url": "https://www.anthropic.com/research/81k-economics",
      "published_at": "2026-04-22T17:04:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d52a17a088"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/economic-index-survey-announcement",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Announcing the Anthropic Economic Index Survey",
      "headline": "Announcing the Anthropic Economic Index Survey",
      "url": "https://www.anthropic.com/research/economic-index-survey-announcement",
      "published_at": "2026-04-22T17:01:04Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "23b5b7c1ac"
    },
    {
      "id": "openai|https://openai.com/index/making-chatgpt-better-for-clinicians",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Making ChatGPT better for clinicians",
      "headline": "Making ChatGPT better for clinicians",
      "url": "https://openai.com/index/making-chatgpt-better-for-clinicians",
      "published_at": "2026-04-22T15:00:00Z",
      "description": "OpenAI makes ChatGPT for Clinicians free for verified U.S. physicians, nurse practitioners, and pharmacists, supporting clinical care, documentation, and research.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "f046f616a0"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/decoupled-diloco",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Decoupled DiLoCo: A new frontier for resilient, distributed AI training",
      "headline": "Decoupled DiLoCo: A new frontier for resilient, distributed AI training",
      "url": "https://deepmind.google/blog/decoupled-diloco",
      "published_at": "2026-04-22T10:20:03Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3a20f1402a"
    },
    {
      "id": "openai|https://openai.com/academy/workspace-agents",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Workspace agents",
      "headline": "Workspace agents",
      "url": "https://openai.com/academy/workspace-agents",
      "published_at": "2026-04-22T10:00:00Z",
      "description": "Learn how to build, use, and scale workspace agents in ChatGPT to automate repeatable workflows, connect tools, and streamline team operations.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "002f15c7a8"
    },
    {
      "id": "openai|https://openai.com/index/speeding-up-agentic-workflows-with-websockets",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Speeding up agentic workflows with WebSockets in the Responses API",
      "headline": "Speeding up agentic workflows with WebSockets in the Responses API",
      "url": "https://openai.com/index/speeding-up-agentic-workflows-with-websockets",
      "published_at": "2026-04-22T10:00:00Z",
      "description": "A deep dive into the Codex agent loop, showing how WebSockets and connection-scoped caching reduced API overhead and improved model latency.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0fa5c1c4c0"
    },
    {
      "id": "openai|https://openai.com/index/introducing-workspace-agents-in-chatgpt",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Introducing workspace agents in ChatGPT",
      "headline": "Introducing workspace agents in ChatGPT",
      "url": "https://openai.com/index/introducing-workspace-agents-in-chatgpt",
      "published_at": "2026-04-22T10:00:00Z",
      "description": "Workspace agents in ChatGPT are Codex-powered agents that automate complex workflows, run in the cloud, and help teams scale work across tools securely.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "e4ace5ec79"
    },
    {
      "id": "huggingface|https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "deepseek-ai/DeepSeek-V4-Pro · 3391♥ · 381587 dl",
      "headline": "DeepSeek-V4-Pro is a text-generation model on HuggingFace with 381,587 downloads and 3,391 likes.",
      "url": "https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro",
      "published_at": "2026-04-22T06:04:45Z",
      "description": "text-generation",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:28:31Z",
      "anchor": "b64a7c8282"
    },
    {
      "id": "huggingface|https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "deepseek-ai/DeepSeek-V4-Flash · 917♥ · 345885 dl",
      "headline": "DeepSeek-V4-Flash, a text-generation model, achieved 345,885 downloads on HuggingFace.",
      "url": "https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash",
      "published_at": "2026-04-22T06:04:20Z",
      "description": "text-generation",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:28:42Z",
      "anchor": "8f8d8bae98"
    },
    {
      "id": "huggingface|https://huggingface.co/unsloth/Qwen3.6-27B-GGUF",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "unsloth/Qwen3.6-27B-GGUF · 537♥ · 983535 dl",
      "headline": "HuggingFace released Qwen 3.6 27B GGUF, a quantized multimodal model supporting image-text-to-text with 983K downloads.",
      "url": "https://huggingface.co/unsloth/Qwen3.6-27B-GGUF",
      "published_at": "2026-04-22T05:43:37Z",
      "description": "image-text-to-text",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:28:55Z",
      "anchor": "c4c0bc869a"
    },
    {
      "id": "huggingface|https://huggingface.co/sensenova/SenseNova-U1-8B-MoT",
      "anchor": "e94bae95ae",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "sensenova/SenseNova-U1-8B-MoT · 119♥ · 1308 dl",
      "headline": "SenseNova releases SenseNova-U1-8B-MoT, an any-to-any model on HuggingFace.",
      "url": "https://huggingface.co/sensenova/SenseNova-U1-8B-MoT",
      "published_at": "2026-04-22T04:43:54Z",
      "discovered_at": "2026-05-03T01:20:39Z",
      "description": "any-to-any",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-03T01:20:57Z"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-openai-launches-gpt-image",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] OpenAI launches GPT-Image-2",
      "headline": "[AINews] OpenAI launches GPT-Image-2",
      "url": "https://www.latent.space/p/ainews-openai-launches-gpt-image",
      "published_at": "2026-04-22T00:23:52Z",
      "description": "with Cursor getting a $10B contract with xAI and a right to acquire for $60B.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "88fac50b95"
    },
    {
      "id": "openai|https://openai.com/index/introducing-openai-privacy-filter",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Introducing OpenAI Privacy Filter",
      "headline": "Introducing OpenAI Privacy Filter",
      "url": "https://openai.com/index/introducing-openai-privacy-filter",
      "published_at": "2026-04-22T00:00:00Z",
      "description": "OpenAI Privacy Filter is an open-weight model for detecting and redacting personally identifiable information (PII) in text with state-of-the-art accuracy",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "7f3b39acee"
    },
    {
      "id": "mistral|https://mistral.ai/news/two-users-one-cli-people-and-agents",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Two users, one CLI: people and agents",
      "headline": "Two users, one CLI: people and agents",
      "url": "https://mistral.ai/news/two-users-one-cli-people-and-agents",
      "published_at": "2026-04-21T20:32:48Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "56b6f25fb1"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/partnering-with-industry-leaders-to-accelerate-ai-transformation",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Partnering with industry leaders to accelerate AI transformation",
      "headline": "Partnering with industry leaders to accelerate AI transformation",
      "url": "https://deepmind.google/blog/partnering-with-industry-leaders-to-accelerate-ai-transformation",
      "published_at": "2026-04-21T14:54:15Z",
      "description": "Google DeepMind partners with global consultancies to bring the power of frontier AI to organizations around the world.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9090173467"
    },
    {
      "id": "openai|https://openai.com/index/introducing-chatgpt-images-2-0",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Introducing ChatGPT Images 2.0",
      "headline": "Introducing ChatGPT Images 2.0",
      "url": "https://openai.com/index/introducing-chatgpt-images-2-0",
      "published_at": "2026-04-21T12:00:00Z",
      "description": "ChatGPT Images 2.0 introduces a state-of-the-art image generation model with improved text rendering, multilingual support, and advanced visual reasoning.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "751b28d668"
    },
    {
      "id": "huggingface|https://huggingface.co/Qwen/Qwen3.6-27B",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "Qwen/Qwen3.6-27B · 1063♥ · 1070778 dl",
      "headline": "HuggingFace's Qwen3.6-27B multimodal model for image-text-to-text tasks surpasses 1 million downloads.",
      "url": "https://huggingface.co/Qwen/Qwen3.6-27B",
      "published_at": "2026-04-21T07:50:43Z",
      "description": "image-text-to-text",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:29:04Z",
      "anchor": "9edd5d53e3"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-moonshot-kimi-k26-the-worlds",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] Moonshot Kimi K2.6: the world's leading Open Model refreshes to catch up to Opus 4.6 (ahead of DeepSeek v4?)",
      "headline": "[AINews] Moonshot Kimi K2.6: the world's leading Open Model refreshes to catch up to Opus 4.6 (ahead of DeepSeek v4?)",
      "url": "https://www.latent.space/p/ainews-moonshot-kimi-k26-the-worlds",
      "published_at": "2026-04-21T00:19:33Z",
      "description": "Yay Kimi!!!",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b12f2d221f"
    },
    {
      "id": "openai|https://openai.com/index/scaling-codex-to-enterprises-worldwide",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Scaling Codex to enterprises worldwide",
      "headline": "Scaling Codex to enterprises worldwide",
      "url": "https://openai.com/index/scaling-codex-to-enterprises-worldwide",
      "published_at": "2026-04-21T00:00:00Z",
      "description": "OpenAI launches Codex Labs, partners with Accenture, PwC, Infosys, and others to help enterprises deploy and scale Codex across the software development lifecycle, and hits 4M Codex WAU.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "076f92d370"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/reading-todays-open-closed-performance",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Reading today's open-closed performance gap",
      "headline": "Reading today's open-closed performance gap",
      "url": "https://www.interconnects.ai/p/reading-todays-open-closed-performance",
      "published_at": "2026-04-20T18:25:02Z",
      "description": "The complex factors that determine the single evaluation number so many focus on. Plus, how this changes in the future.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "25a98faef7"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/claude-opus-4-7",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Introducing Claude Opus 4.7",
      "headline": "Introducing Claude Opus 4.7",
      "url": "https://www.anthropic.com/news/claude-opus-4-7",
      "published_at": "2026-04-20T17:27:26Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "7445f5ca44"
    },
    {
      "id": "latentspace|https://www.latent.space/p/noetik",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "🔬 Training Transformers to solve 95% failure rate of Cancer Trials — Ron Alfa & Daniel Bear, Noetik",
      "headline": "🔬 Training Transformers to solve 95% failure rate of Cancer Trials — Ron Alfa & Daniel Bear, Noetik",
      "url": "https://www.latent.space/p/noetik",
      "published_at": "2026-04-20T16:17:17Z",
      "description": "95% of cancer treatments fail to pass clinical trials, but it may be a matching problem — that Noetik is solving with autoregressive transformers like TARIO-2!",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "eea19ed9c6"
    },
    {
      "id": "huggingface|https://huggingface.co/talkie-lm/talkie-1930-13b-it",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "talkie-lm/talkie-1930-13b-it · 194♥ · 0 dl",
      "headline": "Talkie-1930-13B-IT, a 13 billion parameter Italian language model, was published on HuggingFace.",
      "url": "https://huggingface.co/talkie-lm/talkie-1930-13b-it",
      "published_at": "2026-04-20T10:43:41Z",
      "description": "",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:29:14Z",
      "anchor": "999e3df2bf"
    },
    {
      "id": "huggingface|https://huggingface.co/nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · 192♥ · 37418 dl",
      "headline": "NVIDIA releases Nemotron-3-Nano-Omni-30B, a 30-billion-parameter model enabling any-to-any multimodal conversions.",
      "url": "https://huggingface.co/nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16",
      "published_at": "2026-04-20T04:40:42Z",
      "description": "any-to-any",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:29:30Z",
      "anchor": "a69f567a8f"
    },
    {
      "id": "openai|https://openai.com/index/hyatt-advances-ai-with-chatgpt-enterprise",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "OpenAI helps Hyatt advance AI among colleagues",
      "headline": "OpenAI helps Hyatt advance AI among colleagues",
      "url": "https://openai.com/index/hyatt-advances-ai-with-chatgpt-enterprise",
      "published_at": "2026-04-20T00:00:00Z",
      "description": "Hyatt deploys ChatGPT Enterprise across its global workforce, using GPT-5.4 and Codex to improve productivity, operations, and guest experiences.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9c9984ae5e"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-the-two-sides-of-openclaw",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] The Two Sides of OpenClaw",
      "headline": "[AINews] The Two Sides of OpenClaw",
      "url": "https://www.latent.space/p/ainews-the-two-sides-of-openclaw",
      "published_at": "2026-04-18T06:50:57Z",
      "description": "a quiet day lets us reflect on openclaw this week.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6e494d6b68"
    },
    {
      "id": "huggingface|https://huggingface.co/openai/privacy-filter",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "openai/privacy-filter · 1196♥ · 99399 dl",
      "headline": "OpenAI's privacy-filter token-classification model on HuggingFace has 1,196 likes and 99,399 downloads.",
      "url": "https://huggingface.co/openai/privacy-filter",
      "published_at": "2026-04-17T21:38:35Z",
      "description": "token-classification",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:29:41Z",
      "anchor": "3e1d8e8260"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/claude-design-anthropic-labs",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Introducing Claude Design by Anthropic Labs",
      "headline": "Introducing Claude Design by Anthropic Labs",
      "url": "https://www.anthropic.com/news/claude-design-anthropic-labs",
      "published_at": "2026-04-17T14:58:21Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "baef59ab97"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-anthropic-claude-opus-47-literally",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] Anthropic Claude Opus 4.7 - literally one step better than 4.6 in every dimension",
      "headline": "[AINews] Anthropic Claude Opus 4.7 - literally one step better than 4.6 in every dimension",
      "url": "https://www.latent.space/p/ainews-anthropic-claude-opus-47-literally",
      "published_at": "2026-04-17T01:36:17Z",
      "description": "The new SOTA model asserts its dominance.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "e0bf82d936"
    },
    {
      "id": "xai|https://x.ai/news/grok-stt-and-tts-apis",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Speech to Text and Text to Speech APIs",
      "headline": "Grok Speech to Text and Text to Speech APIs",
      "url": "https://x.ai/news/grok-stt-and-tts-apis",
      "published_at": "2026-04-17T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a690c89f52"
    },
    {
      "id": "openai|https://openai.com/index/codex-for-almost-everything",
      "source_id": "openai",
      "source_name": "OpenAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "openai"
      ],
      "title": "Codex for (almost) everything",
      "headline": "Codex for (almost) everything",
      "url": "https://openai.com/index/codex-for-almost-everything",
      "published_at": "2026-04-16T10:00:00Z",
      "description": "The updated Codex app for macOS and Windows adds computer use, in-app browsing, image generation, memory, and plugins to accelerate developer workflows.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d12cebff5a"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-rip-pull-requests-2005-2026",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] RIP Pull Requests (2005-2026)",
      "headline": "[AINews] RIP Pull Requests (2005-2026)",
      "url": "https://www.latent.space/p/ainews-rip-pull-requests-2005-2026",
      "published_at": "2026-04-16T06:41:12Z",
      "description": "a quiet day lets us report on the death of the pull requests",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "c1151871dc"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/my-bets-on-open-models-mid-2026",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "My bets on open models, mid-2026",
      "headline": "My bets on open models, mid-2026",
      "url": "https://www.interconnects.ai/p/my-bets-on-open-models-mid-2026",
      "published_at": "2026-04-15T18:20:00Z",
      "description": "What I expect to come next and why, focused on the open-closed gap.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "c968dcbf0a"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-3-1-flash-tts-the-next-generation-of-expressive-ai-speech",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini 3.1 Flash TTS: the next generation of expressive AI speech",
      "headline": "Gemini 3.1 Flash TTS: the next generation of expressive AI speech",
      "url": "https://deepmind.google/blog/gemini-3-1-flash-tts-the-next-generation-of-expressive-ai-speech",
      "published_at": "2026-04-15T16:03:19Z",
      "description": "Our newest audio model introduces granular audio tags that give you precise control to direct AI speech for expressive audio generation.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6e0455fefa"
    },
    {
      "id": "anthropic|https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Equipping agents for the real world with Agent Skills",
      "headline": "Equipping agents for the real world with Agent Skills",
      "url": "https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills",
      "published_at": "2026-04-15T14:39:01Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0834b556d0"
    },
    {
      "id": "mistral|https://mistral.ai/news/connectors",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Connect the dots: Build with built-in and custom MCPs in Studio",
      "headline": "Connect the dots: Build with built-in and custom MCPs in Studio",
      "url": "https://mistral.ai/news/connectors",
      "published_at": "2026-04-15T12:53:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "574214117f"
    },
    {
      "id": "huggingface|https://huggingface.co/Qwen/Qwen3.6-35B-A3B",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "Qwen/Qwen3.6-35B-A3B · 1559♥ · 2397446 dl",
      "headline": "Qwen releases Qwen3.6-35B-A3B image-text-to-text model on HuggingFace with 2.4M downloads.",
      "url": "https://huggingface.co/Qwen/Qwen3.6-35B-A3B",
      "published_at": "2026-04-15T05:59:19Z",
      "description": "image-text-to-text",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:42:21Z",
      "anchor": "ba5f352c75"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-humanitys-last-gasp",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] Humanity's Last Gasp",
      "headline": "[AINews] Humanity's Last Gasp",
      "url": "https://www.latent.space/p/ainews-humanitys-last-gasp",
      "published_at": "2026-04-15T03:05:54Z",
      "description": "a quiet day lets us reflect on work in the time of AI",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d0ffe29f19"
    },
    {
      "id": "latentspace|https://www.latent.space/p/notion",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "Notion’s Token Town: 5 Rebuilds, 100+ Tools, MCP vs CLIs and the Software Factory Future — Simon Last & Sarah Sachs of Notion",
      "headline": "Notion’s Token Town: 5 Rebuilds, 100+ Tools, MCP vs CLIs and the Software Factory Future — Simon Last & Sarah Sachs of Notion",
      "url": "https://www.latent.space/p/notion",
      "published_at": "2026-04-15T00:31:14Z",
      "description": "Notion's cofounder and head of AI peel back the curtains to talk about finally shipping the Knowledge Work AI agents the world has been waiting for.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0ecc976620"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/what-ive-been-building-atom-report",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "What I’ve been building: ATOM Report, post-training course, finishing my book, and ongoing research",
      "headline": "What I’ve been building: ATOM Report, post-training course, finishing my book, and ongoing research",
      "url": "https://www.interconnects.ai/p/what-ive-been-building-atom-report",
      "published_at": "2026-04-14T20:41:12Z",
      "description": "What I've been up to!",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8759eed3fb"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/automated-alignment-researchers",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Automated Alignment Researchers: Using large language models to scale scalable oversight",
      "headline": "Automated Alignment Researchers: Using large language models to scale scalable oversight",
      "url": "https://www.anthropic.com/research/automated-alignment-researchers",
      "published_at": "2026-04-14T19:00:21Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "4e0fe0a497"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/narasimhan-board",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Anthropic’s Long-Term Benefit Trust appoints Vas Narasimhan to Board of Directors",
      "headline": "Anthropic’s Long-Term Benefit Trust appoints Vas Narasimhan to Board of Directors",
      "url": "https://www.anthropic.com/news/narasimhan-board",
      "published_at": "2026-04-14T14:00:04Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6eb5a42778"
    },
    {
      "id": "latentspace|https://www.latent.space/p/ainews-top-local-models-list-april",
      "source_id": "latentspace",
      "source_name": "Latent Space",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "latentspace"
      ],
      "title": "[AINews] Top Local Models List - April 2026",
      "headline": "[AINews] Top Local Models List - April 2026",
      "url": "https://www.latent.space/p/ainews-top-local-models-list-april",
      "published_at": "2026-04-14T08:43:33Z",
      "description": "a quiet day lets us check in on the local models scene",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5a0144ed11"
    },
    {
      "id": "huggingface|https://huggingface.co/moonshotai/Kimi-K2.6",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "moonshotai/Kimi-K2.6 · 1175♥ · 699348 dl",
      "headline": "Moonshotai's Kimi-K2.6, an image-text-to-text model, has gained 1,175 likes and 699,348 downloads on HuggingFace.",
      "url": "https://huggingface.co/moonshotai/Kimi-K2.6",
      "published_at": "2026-04-14T04:23:36Z",
      "description": "image-text-to-text",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:42:30Z",
      "anchor": "5e8d5f2606"
    },
    {
      "id": "anthropic|https://www.anthropic.com/engineering/building-effective-agents",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Building Effective AI Agents",
      "headline": "Building Effective AI Agents",
      "url": "https://www.anthropic.com/engineering/building-effective-agents",
      "published_at": "2026-04-13T17:46:47Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "2d715b77fb"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-robotics-er-1-6",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning",
      "headline": "Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning",
      "url": "https://deepmind.google/blog/gemini-robotics-er-1-6",
      "published_at": "2026-04-13T15:52:13Z",
      "description": "Gemini Robotics ER 1.6: Enhancing spatial reasoning and multi-view understanding for autonomous robotics.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "e04787588d"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/the-inevitable-need-for-an-open-model",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "The inevitable need for an open model consortium",
      "headline": "The inevitable need for an open model consortium",
      "url": "https://www.interconnects.ai/p/the-inevitable-need-for-an-open-model",
      "published_at": "2026-04-11T13:02:06Z",
      "description": "And yes, I hate consortia too.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0835df6bda"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/claude-for-financial-services",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Claude for Financial Services",
      "headline": "Claude for Financial Services",
      "url": "https://www.anthropic.com/news/claude-for-financial-services",
      "published_at": "2026-04-10T12:11:09Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "edd4f0ed72"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/claude-mythos-and-misguided-open",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Claude Mythos and misguided open-weight fearmongering",
      "headline": "Claude Mythos and misguided open-weight fearmongering",
      "url": "https://www.interconnects.ai/p/claude-mythos-and-misguided-open",
      "published_at": "2026-04-09T21:28:39Z",
      "description": "Another dance around fears of open-source.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6c722e1408"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/healthcare-life-sciences",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Advancing Claude in healthcare and the life sciences",
      "headline": "Advancing Claude in healthcare and the life sciences",
      "url": "https://www.anthropic.com/news/healthcare-life-sciences",
      "published_at": "2026-04-09T20:25:31Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "577ab1c8d5"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/trustworthy-agents",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Trustworthy agents in practice",
      "headline": "Trustworthy agents in practice",
      "url": "https://www.anthropic.com/research/trustworthy-agents",
      "published_at": "2026-04-09T17:00:08Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "24804338fc"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/labor-market-impacts",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Labor market impacts of AI: A new measure and early evidence",
      "headline": "Labor market impacts of AI: A new measure and early evidence",
      "url": "https://www.anthropic.com/research/labor-market-impacts",
      "published_at": "2026-04-08T14:10:47Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "002376e0ca"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/google-broadcom-partnership-compute",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute",
      "headline": "Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute",
      "url": "https://www.anthropic.com/news/google-broadcom-partnership-compute",
      "published_at": "2026-04-06T21:43:38Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "682060a33d"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/diff-tool",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "A “diff” tool for AI: Finding behavioral differences in new models",
      "headline": "A “diff” tool for AI: Finding behavioral differences in new models",
      "url": "https://www.anthropic.com/research/diff-tool",
      "published_at": "2026-04-03T21:02:41Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "930d44c660"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/gemma-4-and-what-makes-an-open-model",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Gemma 4 and what makes an open model succeed",
      "headline": "Gemma 4 and what makes an open model succeed",
      "url": "https://www.interconnects.ai/p/gemma-4-and-what-makes-an-open-model",
      "published_at": "2026-04-03T16:57:36Z",
      "description": "Hint: it's not benchmark scores.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "83eb602f3f"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemma-4-byte-for-byte-the-most-capable-open-models",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemma 4: Byte for byte, the most capable open models",
      "headline": "Gemma 4: Byte for byte, the most capable open models",
      "url": "https://deepmind.google/blog/gemma-4-byte-for-byte-the-most-capable-open-models",
      "published_at": "2026-04-02T16:00:49Z",
      "description": "Gemma 4: Our most intelligent open models to date, purpose-built for advanced reasoning and agentic workflows.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5d78ca4771"
    },
    {
      "id": "anthropic|https://www.anthropic.com/news/australia-MOU",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Australian government and Anthropic sign MOU for AI safety and research",
      "headline": "Australian government and Anthropic sign MOU for AI safety and research",
      "url": "https://www.anthropic.com/news/australia-MOU",
      "published_at": "2026-03-31T23:16:54Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "089c240ea5"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/how-australia-uses-claude",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "How Australia Uses Claude: Findings from the Anthropic Economic Index",
      "headline": "How Australia Uses Claude: Findings from the Anthropic Economic Index",
      "url": "https://www.anthropic.com/research/how-australia-uses-claude",
      "published_at": "2026-03-31T22:35:47Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "1e22c387e5"
    },
    {
      "id": "huggingface|https://huggingface.co/mistralai/Mistral-Medium-3.5-128B",
      "source_id": "huggingface",
      "source_name": "HuggingFace",
      "category": "code",
      "category_emoji": "💻",
      "hashtags": [
        "code",
        "huggingface"
      ],
      "title": "mistralai/Mistral-Medium-3.5-128B · 212♥ · 8492 dl",
      "headline": "Mistral-Medium-3.5-128B, a 128-billion parameter model from Mistral AI, is available on HuggingFace with 8,492 downloads.",
      "url": "https://huggingface.co/mistralai/Mistral-Medium-3.5-128B",
      "published_at": "2026-03-31T09:50:20Z",
      "description": "",
      "summarize": true,
      "summarized": true,
      "summarized_at": "2026-05-02T13:42:40Z",
      "anchor": "8fc5ebf033"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/latest-open-artifacts-20-new-orgs",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Latest open artifacts (#20): New orgs! New types of models! With Nemotron Super, Sarvam, Cohere Transcribe, & others",
      "headline": "Latest open artifacts (#20): New orgs! New types of models! With Nemotron Super, Sarvam, Cohere Transcribe, & others",
      "url": "https://www.interconnects.ai/p/latest-open-artifacts-20-new-orgs",
      "published_at": "2026-03-30T13:02:45Z",
      "description": "New orgs! New types of models! With Nemotron Super, Sarvam, Cohere Transcribe, & others",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "1961b44a41"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-3-1-flash-live-making-audio-ai-more-natural-and-reliable",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini 3.1 Flash Live: Making audio AI more natural and reliable",
      "headline": "Gemini 3.1 Flash Live: Making audio AI more natural and reliable",
      "url": "https://deepmind.google/blog/gemini-3-1-flash-live-making-audio-ai-more-natural-and-reliable",
      "published_at": "2026-03-26T15:23:35Z",
      "description": "Our latest voice model has improved precision and lower latency to make voice interactions more fluid, natural and precise.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0fb8949769"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/project-fetch-robot-dog",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Project Fetch: Can Claude train a robot dog?",
      "headline": "Project Fetch: Can Claude train a robot dog?",
      "url": "https://www.anthropic.com/research/project-fetch-robot-dog",
      "published_at": "2026-03-26T11:07:12Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6f90f0290e"
    },
    {
      "id": "anthropic|https://www.anthropic.com/research/project-vend-2",
      "source_id": "anthropic",
      "source_name": "Anthropic",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "anthropic"
      ],
      "title": "Project Vend: Phase two",
      "headline": "Project Vend: Phase two",
      "url": "https://www.anthropic.com/research/project-vend-2",
      "published_at": "2026-03-26T11:07:02Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b9e3ecc62b"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/protecting-people-from-harmful-manipulation",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Protecting people from harmful manipulation",
      "headline": "Protecting people from harmful manipulation",
      "url": "https://deepmind.google/blog/protecting-people-from-harmful-manipulation",
      "published_at": "2026-03-25T16:46:20Z",
      "description": "Google DeepMind researches AI's harmful manipulation risks across areas like finance and health, leading to new safety measures.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3d4d1ae313"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/lyria-3-pro-create-longer-tracks-in-more",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Lyria 3 Pro: Create longer tracks in more",
      "headline": "Lyria 3 Pro: Create longer tracks in more",
      "url": "https://deepmind.google/blog/lyria-3-pro-create-longer-tracks-in-more",
      "published_at": "2026-03-25T16:01:39Z",
      "description": "Introducing Lyria 3 Pro, which unlocks longer tracks with structural awareness. We’re also bringing Lyria to more Google products and surfaces.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "fabacec91f"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/lossy-self-improvement",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Lossy self-improvement",
      "headline": "Lossy self-improvement",
      "url": "https://www.interconnects.ai/p/lossy-self-improvement",
      "published_at": "2026-03-22T19:39:40Z",
      "description": "The case for why self-improvement is real but it doesn't lead to fast takeoff.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "28635a38d9"
    },
    {
      "id": "mistral|https://mistral.ai/news/leanstral",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Leanstral: Open-Source foundation for trustworthy vibe-coding",
      "headline": "Leanstral: Open-Source foundation for trustworthy vibe-coding",
      "url": "https://mistral.ai/news/leanstral",
      "published_at": "2026-03-19T12:18:37Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d1ac364b57"
    },
    {
      "id": "mistral|https://mistral.ai/news/forge",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Introducing Forge",
      "headline": "Introducing Forge",
      "url": "https://mistral.ai/news/forge",
      "published_at": "2026-03-18T17:03:13Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "63c3ecc6c0"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/gpt-54-is-a-big-step-for-codex",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "GPT 5.4 is a big step for Codex",
      "headline": "GPT 5.4 is a big step for Codex",
      "url": "https://www.interconnects.ai/p/gpt-54-is-a-big-step-for-codex",
      "published_at": "2026-03-18T13:02:54Z",
      "description": "On evaluating and understanding the frontier of agents, and why I still turn to Claude.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6722232865"
    },
    {
      "id": "mistral|https://mistral.ai/news/mistral-small-4",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Introducing Mistral Small 4",
      "headline": "Introducing Mistral Small 4",
      "url": "https://mistral.ai/news/mistral-small-4",
      "published_at": "2026-03-17T20:05:07Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ac2fad0360"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/measuring-progress-toward-agi-a-cognitive-framework",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Measuring progress toward AGI: A cognitive framework",
      "headline": "Measuring progress toward AGI: A cognitive framework",
      "url": "https://deepmind.google/blog/measuring-progress-toward-agi-a-cognitive-framework",
      "published_at": "2026-03-17T16:03:47Z",
      "description": "We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "4d052044bf"
    },
    {
      "id": "mistral|https://mistral.ai/news/voxtral-transcribe-2",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Voxtral transcribes at the speed of sound.",
      "headline": "Voxtral transcribes at the speed of sound.",
      "url": "https://mistral.ai/news/voxtral-transcribe-2",
      "published_at": "2026-03-16T19:16:36Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "dd77034ee9"
    },
    {
      "id": "mistral|https://mistral.ai/news/mistral-ai-and-nvidia-partner-to-accelerate-open-frontier-models",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Mistral AI partners with NVIDIA to accelerate open frontier models",
      "headline": "Mistral AI partners with NVIDIA to accelerate open frontier models",
      "url": "https://mistral.ai/news/mistral-ai-and-nvidia-partner-to-accelerate-open-frontier-models",
      "published_at": "2026-03-16T19:03:42Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "96143347a9"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/the-next-phase-of-open-models",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "What comes next with open models",
      "headline": "What comes next with open models",
      "url": "https://www.interconnects.ai/p/the-next-phase-of-open-models",
      "published_at": "2026-03-16T13:00:51Z",
      "description": "Markets, capabilities, cope, and bewilderment in the industrialization of language models.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "7b79220d4a"
    },
    {
      "id": "mistral|https://mistral.ai/news/rails-testing-on-autopilot-building-an-agent-that-writes-what-developers-wont",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Rails testing on autopilot: Building an agent that writes what developers won't",
      "headline": "Rails testing on autopilot: Building an agent that writes what developers won't",
      "url": "https://mistral.ai/news/rails-testing-on-autopilot-building-an-agent-that-writes-what-developers-wont",
      "published_at": "2026-03-13T21:06:44Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a140e87373"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/10-years-of-alphago",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "From games to biology and beyond: 10 years of AlphaGo’s impact",
      "headline": "From games to biology and beyond: 10 years of AlphaGo’s impact",
      "url": "https://deepmind.google/blog/10-years-of-alphago",
      "published_at": "2026-03-09T13:52:36Z",
      "description": "Ten years since AlphaGo, we explore how it is catalyzing scientific discovery and paving a path to AGI.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "99619c97be"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/how-anthropic-vs-dow-impacts-open",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Dean Ball on open models and government control",
      "headline": "Dean Ball on open models and government control",
      "url": "https://www.interconnects.ai/p/how-anthropic-vs-dow-impacts-open",
      "published_at": "2026-03-06T14:03:27Z",
      "description": "Subtle precedents on the future of open models set by the unfolding Anthropic v. Department of War case.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0049d4bb78"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/olmo-hybrid-and-future-llm-architectures",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Olmo Hybrid and future LLM architectures",
      "headline": "Olmo Hybrid and future LLM architectures",
      "url": "https://www.interconnects.ai/p/olmo-hybrid-and-future-llm-architectures",
      "published_at": "2026-03-05T16:16:44Z",
      "description": "The latest Olmo model and discussions at the frontier of open-source post training tools.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "063ef2d44a"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-3-1-flash-lite-built-for-intelligence-at-scale",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini 3.1 Flash-Lite: Built for intelligence at scale",
      "headline": "Gemini 3.1 Flash-Lite: Built for intelligence at scale",
      "url": "https://deepmind.google/blog/gemini-3-1-flash-lite-built-for-intelligence-at-scale",
      "published_at": "2026-03-03T16:35:55Z",
      "description": "Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8e9e76d041"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/latest-open-artifacts-19-qwen-35",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 — Chinese labs' latest push of the frontier",
      "headline": "Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 — Chinese labs' latest push of the frontier",
      "url": "https://www.interconnects.ai/p/latest-open-artifacts-19-qwen-35",
      "published_at": "2026-03-03T16:30:59Z",
      "description": "Welcome to the year of the horse!",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b90c7d4851"
    },
    {
      "id": "mistral|https://mistral.ai/news/pixtral-12b",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "[Deprecated] Pixtral 12B",
      "headline": "[Deprecated] Pixtral 12B",
      "url": "https://mistral.ai/news/pixtral-12b",
      "published_at": "2026-03-02T10:38:40Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "7372d9fca9"
    },
    {
      "id": "mistral|https://mistral.ai/news/pixtral-large",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "[Deprecated] Pixtral Large",
      "headline": "[Deprecated] Pixtral Large",
      "url": "https://mistral.ai/news/pixtral-large",
      "published_at": "2026-03-01T08:46:58Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5aaaed36c4"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/nano-banana-2-combining-pro-capabilities-with-lightning-fast-speed",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Nano Banana 2: Combining Pro capabilities with lightning-fast speed",
      "headline": "Nano Banana 2: Combining Pro capabilities with lightning-fast speed",
      "url": "https://deepmind.google/blog/nano-banana-2-combining-pro-capabilities-with-lightning-fast-speed",
      "published_at": "2026-02-26T16:01:50Z",
      "description": "Our latest image generation model offers advanced world knowledge, production ready specs, subject consistency and more, all at Flash speed.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "92cab239f5"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/how-much-does-distillation-really",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "How much does distillation really matter for Chinese LLMs?",
      "headline": "How much does distillation really matter for Chinese LLMs?",
      "url": "https://www.interconnects.ai/p/how-much-does-distillation-really",
      "published_at": "2026-02-24T16:06:43Z",
      "description": "Reacting to Anthropic's post on \"distillation attacks.\"",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "bb5ec20a0b"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-3-1-pro-a-smarter-model-for-your-most-complex-tasks",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini 3.1 Pro: A smarter model for your most complex tasks",
      "headline": "Gemini 3.1 Pro: A smarter model for your most complex tasks",
      "url": "https://deepmind.google/blog/gemini-3-1-pro-a-smarter-model-for-your-most-complex-tasks",
      "published_at": "2026-02-19T16:06:14Z",
      "description": "3.1 Pro is designed for tasks where a simple answer isn’t enough.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "68180c7c9d"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/a-new-way-to-express-yourself-gemini-can-now-create-music",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "A new way to express yourself: Gemini can now create music",
      "headline": "A new way to express yourself: Gemini can now create music",
      "url": "https://deepmind.google/blog/a-new-way-to-express-yourself-gemini-can-now-create-music",
      "published_at": "2026-02-18T16:01:38Z",
      "description": "The Gemini app now features our most advanced music generation model Lyria 3, empowering anyone to make 30-second tracks using text or images.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "bde589b0f7"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/open-models-in-perpetual-catch-up",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Open models in perpetual catch-up",
      "headline": "Open models in perpetual catch-up",
      "url": "https://www.interconnects.ai/p/open-models-in-perpetual-catch-up",
      "published_at": "2026-02-17T17:27:36Z",
      "description": "The open-closed gap, distillation, innovation timescales, how open models win, specialized models, what’s missing, etc.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b73bf415ba"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/accelerating-discovery-in-india-through-ai-powered-science-and-education",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Accelerating discovery in India through AI-powered science and education",
      "headline": "Accelerating discovery in India through AI-powered science and education",
      "url": "https://deepmind.google/blog/accelerating-discovery-in-india-through-ai-powered-science-and-education",
      "published_at": "2026-02-17T13:42:20Z",
      "description": "Google DeepMind brings National Partnerships for AI initiative to India, scaling AI for science and education",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "97a044b852"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-3-deep-think-advancing-science-research-and-engineering",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini 3 Deep Think: Advancing science, research and engineering",
      "headline": "Gemini 3 Deep Think: Advancing science, research and engineering",
      "url": "https://deepmind.google/blog/gemini-3-deep-think-advancing-science-research-and-engineering",
      "published_at": "2026-02-12T16:15:09Z",
      "description": "Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ab5bef8707"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-think",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Accelerating Mathematical and Scientific Discovery with Gemini Deep Think",
      "headline": "Accelerating Mathematical and Scientific Discovery with Gemini Deep Think",
      "url": "https://deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-think",
      "published_at": "2026-02-09T16:12:06Z",
      "description": "Research papers point to the growing impact of Deep Think across fields",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b331898673"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/opus-46-vs-codex-53",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Opus 4.6, Codex 5.3, and the post-benchmark era",
      "headline": "Opus 4.6, Codex 5.3, and the post-benchmark era",
      "url": "https://www.interconnects.ai/p/opus-46-vs-codex-53",
      "published_at": "2026-02-09T14:03:12Z",
      "description": "On comparing models in 2026.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "a08c5d72d6"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/why-nvidia-builds-open-models-with",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Why Nvidia builds open models with Bryan Catanzaro",
      "headline": "Why Nvidia builds open models with Bryan Catanzaro",
      "url": "https://www.interconnects.ai/p/why-nvidia-builds-open-models-with",
      "published_at": "2026-02-04T18:00:28Z",
      "description": "Interconnects interview #17 on the past, present, and future of the Nemotron project.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d2e9714c56"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/latest-open-artifacts-18-arcees-big",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Latest open artifacts (#18): Arcee's 400B MoE, LiquidAI's underrated 1B model, new Kimi, and anticipation of a busy month",
      "headline": "Latest open artifacts (#18): Arcee's 400B MoE, LiquidAI's underrated 1B model, new Kimi, and anticipation of a busy month",
      "url": "https://www.interconnects.ai/p/latest-open-artifacts-18-arcees-big",
      "published_at": "2026-02-02T13:03:33Z",
      "description": "Tons of useful \"niche\" models and anticipation of big releases coming soon.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5bdd5e8396"
    },
    {
      "id": "xai|https://x.ai/news/xai-joins-spacex",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "xAI joins SpaceX",
      "headline": "xAI joins SpaceX",
      "url": "https://x.ai/news/xai-joins-spacex",
      "published_at": "2026-02-02T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "171dece929"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/thoughts-on-the-hiring-market-in",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Thoughts on the job market in the age of LLMs",
      "headline": "Thoughts on the job market in the age of LLMs",
      "url": "https://www.interconnects.ai/p/thoughts-on-the-hiring-market-in",
      "published_at": "2026-01-30T15:49:25Z",
      "description": "On standing out and finding gems.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "31b1a6c782"
    },
    {
      "id": "mistral|https://mistral.ai/news/mistral-vibe-2-0",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Terminally online Mistral Vibe.",
      "headline": "Terminally online Mistral Vibe.",
      "url": "https://mistral.ai/news/mistral-vibe-2-0",
      "published_at": "2026-01-29T23:35:23Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "159e507b87"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/project-genie-experimenting-with-infinite-interactive-worlds",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Project Genie: Experimenting with infinite, interactive worlds",
      "headline": "Project Genie: Experimenting with infinite, interactive worlds",
      "url": "https://deepmind.google/blog/project-genie-experimenting-with-infinite-interactive-worlds",
      "published_at": "2026-01-29T17:01:05Z",
      "description": "Google AI Ultra subscribers in the U.S. can try out Project Genie, an experimental research prototype that lets you create and explore worlds.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6ffea3fd75"
    },
    {
      "id": "xai|https://x.ai/news/grok-imagine-api",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Imagine API",
      "headline": "Grok Imagine API",
      "url": "https://x.ai/news/grok-imagine-api",
      "published_at": "2026-01-28T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3b2f59dedd"
    },
    {
      "id": "interconnects|https://www.interconnects.ai/p/arcee-ai-goes-all-in-on-open-models",
      "source_id": "interconnects",
      "source_name": "Interconnects",
      "category": "writing",
      "category_emoji": "✍️",
      "hashtags": [
        "writing",
        "interconnects"
      ],
      "title": "Arcee AI goes all-in on open models built in the U.S.",
      "headline": "Arcee AI goes all-in on open models built in the U.S.",
      "url": "https://www.interconnects.ai/p/arcee-ai-goes-all-in-on-open-models",
      "published_at": "2026-01-27T22:47:24Z",
      "description": "Interconnects interview #16 to celebrate the release of Trinity Large.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "31575f3a39"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/d4rt-teaching-ai-to-see-the-world-in-four-dimensions",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "D4RT: Teaching AI to see the world in four dimensions",
      "headline": "D4RT: Teaching AI to see the world in four dimensions",
      "url": "https://deepmind.google/blog/d4rt-teaching-ai-to-see-the-world-in-four-dimensions",
      "published_at": "2026-01-16T10:39:00Z",
      "description": "D4RT: Unified, efficient 4D reconstruction and tracking up to 300x faster than prior methods.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "92312597ad"
    },
    {
      "id": "mistral|https://mistral.ai/news/mistral-3",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Introducing Mistral 3",
      "headline": "Introducing Mistral 3",
      "url": "https://mistral.ai/news/mistral-3",
      "published_at": "2026-01-14T21:57:27Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5eb5b435b2"
    },
    {
      "id": "mistral|https://mistral.ai/news/agents-api",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Build AI agents with the Mistral Agents API",
      "headline": "Build AI agents with the Mistral Agents API",
      "url": "https://mistral.ai/news/agents-api",
      "published_at": "2026-01-14T11:04:13Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3c7eb6391b"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/veo-3-1-ingredients-to-video-more-consistency-creativity-and-control",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Veo 3.1 Ingredients to Video: More consistency, creativity and control",
      "headline": "Veo 3.1 Ingredients to Video: More consistency, creativity and control",
      "url": "https://deepmind.google/blog/veo-3-1-ingredients-to-video-more-consistency-creativity-and-control",
      "published_at": "2026-01-13T17:00:18Z",
      "description": "Our latest Veo update generates lively, dynamic clips that feel natural and engaging — and supports vertical video generation.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9dc93c8d5a"
    },
    {
      "id": "mistral|https://mistral.ai/news/llm-as-rag-judge",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Evaluating RAG with LLM as a Judge",
      "headline": "Evaluating RAG with LLM as a Judge",
      "url": "https://mistral.ai/news/llm-as-rag-judge",
      "published_at": "2026-01-13T10:40:29Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "18f6b693bc"
    },
    {
      "id": "xai|https://x.ai/news/series-e",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "xAI Raises $20B Series E",
      "headline": "xAI Raises $20B Series E",
      "url": "https://x.ai/news/series-e",
      "published_at": "2026-01-06T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "55bf7a6b44"
    },
    {
      "id": "xai|https://x.ai/news/grok-business",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Introducing Grok Business and Grok Enterprise",
      "headline": "Introducing Grok Business and Grok Enterprise",
      "url": "https://x.ai/news/grok-business",
      "published_at": "2025-12-30T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b051e91d62"
    },
    {
      "id": "mistral|https://mistral.ai/news/mistral-ocr-3",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Introducing Mistral OCR 3",
      "headline": "Introducing Mistral OCR 3",
      "url": "https://mistral.ai/news/mistral-ocr-3",
      "published_at": "2025-12-23T22:19:09Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "faebfb7fb6"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/googles-year-in-review-8-areas-with-research-breakthroughs-in-2025",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Google's year in review: 8 areas with research breakthroughs in 2025",
      "headline": "Google's year in review: 8 areas with research breakthroughs in 2025",
      "url": "https://deepmind.google/blog/googles-year-in-review-8-areas-with-research-breakthroughs-in-2025",
      "published_at": "2025-12-23T17:01:02Z",
      "description": "Google 2025 recap: Research breakthroughs of the year",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8e297e98b0"
    },
    {
      "id": "xai|https://x.ai/news/grok-collections-api",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Collections API",
      "headline": "Grok Collections API",
      "url": "https://x.ai/news/grok-collections-api",
      "published_at": "2025-12-22T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "50d2d7568a"
    },
    {
      "id": "xai|https://x.ai/news/us-gov-dept-of-war",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Supporting the DOW's mission with AI",
      "headline": "Supporting the DOW's mission with AI",
      "url": "https://x.ai/news/us-gov-dept-of-war",
      "published_at": "2025-12-22T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "068002cf15"
    },
    {
      "id": "mistral|https://mistral.ai/news/devstral-2-vibe-cli",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Introducing: Devstral 2 and Mistral Vibe CLI.",
      "headline": "Introducing: Devstral 2 and Mistral Vibe CLI.",
      "url": "https://mistral.ai/news/devstral-2-vibe-cli",
      "published_at": "2025-12-17T13:58:38Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "d43c99bc1c"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemini-3-flash-frontier-intelligence-built-for-speed",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemini 3 Flash: frontier intelligence built for speed",
      "headline": "Gemini 3 Flash: frontier intelligence built for speed",
      "url": "https://deepmind.google/blog/gemini-3-flash-frontier-intelligence-built-for-speed",
      "published_at": "2025-12-17T11:58:17Z",
      "description": "Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "5671ac06b0"
    },
    {
      "id": "xai|https://x.ai/news/grok-voice-agent-api",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Voice Agent API",
      "headline": "Grok Voice Agent API",
      "url": "https://x.ai/news/grok-voice-agent-api",
      "published_at": "2025-12-17T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "955ff4e6f2"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/gemma-scope-2-helping-the-ai-safety-community-deepen-understanding-of-complex-language-model-behavior",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior",
      "headline": "Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior",
      "url": "https://deepmind.google/blog/gemma-scope-2-helping-the-ai-safety-community-deepen-understanding-of-complex-language-model-behavior",
      "published_at": "2025-12-16T10:14:24Z",
      "description": "Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ce406e3874"
    },
    {
      "id": "mistral|https://mistral.ai/news/mathstral",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "MathΣtral",
      "headline": "MathΣtral",
      "url": "https://mistral.ai/news/mathstral",
      "published_at": "2025-12-12T23:20:37Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "38397f4905"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/improved-gemini-audio-models-for-powerful-voice-experiences",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Improved Gemini audio models for powerful voice experiences",
      "headline": "Improved Gemini audio models for powerful voice experiences",
      "url": "https://deepmind.google/blog/improved-gemini-audio-models-for-powerful-voice-experiences",
      "published_at": "2025-12-12T17:50:50Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "e3ff2d5f8b"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/deepening-our-partnership-with-the-uk-ai-security-institute",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Deepening our partnership with the UK AI Security Institute",
      "headline": "Deepening our partnership with the UK AI Security Institute",
      "url": "https://deepmind.google/blog/deepening-our-partnership-with-the-uk-ai-security-institute",
      "published_at": "2025-12-11T00:06:40Z",
      "description": "Google DeepMind and UK AI Security Institute (AISI) strengthen collaboration on critical AI safety and security research",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0193544a8d"
    },
    {
      "id": "xai|https://x.ai/news/el-salvador-partnership",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "xAI and El Salvador Pioneer the World's First Nationwide AI Education Program",
      "headline": "xAI and El Salvador Pioneer the World's First Nationwide AI Education Program",
      "url": "https://x.ai/news/el-salvador-partnership",
      "published_at": "2025-12-11T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "f34084d619"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/strengthening-our-partnership-with-the-uk-government-to-support-prosperity-and-security-in-the-ai-era",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Strengthening our partnership with the UK government to support prosperity and security in the AI era",
      "headline": "Strengthening our partnership with the UK government to support prosperity and security in the AI era",
      "url": "https://deepmind.google/blog/strengthening-our-partnership-with-the-uk-government-to-support-prosperity-and-security-in-the-ai-era",
      "published_at": "2025-12-10T14:59:21Z",
      "description": "Deepening our partnership with the UK government to support prosperity and security in the AI era",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3cc5d90f1d"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/facts-benchmark-suite-systematically-evaluating-the-factuality-of-large-language-models",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "FACTS Benchmark Suite: Systematically evaluating the factuality of large language models",
      "headline": "FACTS Benchmark Suite: Systematically evaluating the factuality of large language models",
      "url": "https://deepmind.google/blog/facts-benchmark-suite-systematically-evaluating-the-factuality-of-large-language-models",
      "published_at": "2025-12-09T11:29:03Z",
      "description": "Systematically evaluating the factuality of large language models with the FACTS Benchmark Suite.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "cdd3011cce"
    },
    {
      "id": "deepmind|https://deepmind.google/blog/engineering-more-resilient-crops-for-a-warming-climate",
      "source_id": "deepmind",
      "source_name": "Google DeepMind",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "deepmind"
      ],
      "title": "Engineering more resilient crops for a warming climate",
      "headline": "Engineering more resilient crops for a warming climate",
      "url": "https://deepmind.google/blog/engineering-more-resilient-crops-for-a-warming-climate",
      "published_at": "2025-12-04T16:23:24Z",
      "description": "Scientists are using AlphaFold to strengthen a photosynthesis enzyme for resilient, heat-tolerant crops.",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "28079b6b2a"
    },
    {
      "id": "mistral|https://mistral.ai/news/ki-fur-deutschland",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Mistral AI - KI für Deutschland",
      "headline": "Mistral AI - KI für Deutschland",
      "url": "https://mistral.ai/news/ki-fur-deutschland",
      "published_at": "2025-11-21T00:26:09Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "4f522dd70d"
    },
    {
      "id": "xai|https://x.ai/news/grok-4-1-fast",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok 4.1 Fast and Agent Tools API",
      "headline": "Grok 4.1 Fast and Agent Tools API",
      "url": "https://x.ai/news/grok-4-1-fast",
      "published_at": "2025-11-19T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ea57b0cb49"
    },
    {
      "id": "xai|https://x.ai/news/grok-goes-global",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok goes Global with KSA",
      "headline": "Grok goes Global with KSA",
      "url": "https://x.ai/news/grok-goes-global",
      "published_at": "2025-11-19T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3e0c7a83b6"
    },
    {
      "id": "xai|https://x.ai/news/grok-4-1",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok 4.1",
      "headline": "Grok 4.1",
      "url": "https://x.ai/news/grok-4-1",
      "published_at": "2025-11-17T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "b3b70ff4bb"
    },
    {
      "id": "mistral|https://mistral.ai/news/codestral",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Codestral",
      "headline": "Codestral",
      "url": "https://mistral.ai/news/codestral",
      "published_at": "2025-10-30T16:18:53Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9c2a77f43d"
    },
    {
      "id": "mistral|https://mistral.ai/news/ai-studio",
      "source_id": "mistral",
      "source_name": "Mistral",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "mistral"
      ],
      "title": "Introducing Mistral AI Studio.",
      "headline": "Introducing Mistral AI Studio.",
      "url": "https://mistral.ai/news/ai-studio",
      "published_at": "2025-10-24T07:21:44Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8800d63915"
    },
    {
      "id": "xai|https://x.ai/news/onegov",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Expanding xAI for Government with GSA OneGov",
      "headline": "Expanding xAI for Government with GSA OneGov",
      "url": "https://x.ai/news/onegov",
      "published_at": "2025-09-25T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "6f8458f06d"
    },
    {
      "id": "xai|https://x.ai/news/grok-4-fast",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok 4 Fast",
      "headline": "Grok 4 Fast",
      "url": "https://x.ai/news/grok-4-fast",
      "published_at": "2025-09-19T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "0bbda0ebe2"
    },
    {
      "id": "xai|https://x.ai/news/grok-code-fast-1",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Code Fast 1",
      "headline": "Grok Code Fast 1",
      "url": "https://x.ai/news/grok-code-fast-1",
      "published_at": "2025-08-28T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "ba70e8eb92"
    },
    {
      "id": "xai|https://x.ai/news/government",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Announcing xAI for Government",
      "headline": "Announcing xAI for Government",
      "url": "https://x.ai/news/government",
      "published_at": "2025-07-14T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "14ff5dc9a0"
    },
    {
      "id": "xai|https://x.ai/news/grok-4",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok 4",
      "headline": "Grok 4",
      "url": "https://x.ai/news/grok-4",
      "published_at": "2025-07-09T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9bd65a5103"
    },
    {
      "id": "xai|https://x.ai/news/grok-3",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok 3 Beta — The Age of Reasoning Agents",
      "headline": "Grok 3 Beta — The Age of Reasoning Agents",
      "url": "https://x.ai/news/grok-3",
      "published_at": "2025-02-19T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "9f7e89a158"
    },
    {
      "id": "xai|https://x.ai/news/series-c",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "xAI raises $6B Series C",
      "headline": "xAI raises $6B Series C",
      "url": "https://x.ai/news/series-c",
      "published_at": "2024-12-23T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "3432828b4e"
    },
    {
      "id": "xai|https://x.ai/news/grok-1212",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Bringing Grok to Everyone",
      "headline": "Bringing Grok to Everyone",
      "url": "https://x.ai/news/grok-1212",
      "published_at": "2024-12-12T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "28a261c7af"
    },
    {
      "id": "xai|https://x.ai/news/grok-image-generation-release",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok Image Generation Release",
      "headline": "Grok Image Generation Release",
      "url": "https://x.ai/news/grok-image-generation-release",
      "published_at": "2024-12-09T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "8b11c0e2d6"
    },
    {
      "id": "xai|https://x.ai/news/api",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "API Public Beta",
      "headline": "API Public Beta",
      "url": "https://x.ai/news/api",
      "published_at": "2024-11-04T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "724e122a99"
    },
    {
      "id": "xai|https://x.ai/news/grok-2",
      "source_id": "xai",
      "source_name": "xAI",
      "category": "lab",
      "category_emoji": "🧠",
      "hashtags": [
        "lab",
        "xai"
      ],
      "title": "Grok-2 Beta Release",
      "headline": "Grok-2 Beta Release",
      "url": "https://x.ai/news/grok-2",
      "published_at": "2024-08-13T00:00:00Z",
      "description": "",
      "summarize": false,
      "summarized": false,
      "summarized_at": null,
      "anchor": "87b7dc1620"
    }
  ]
}
