Code
flowchart LR
  A[cronR 07:00] --> B[R script<br/>fetch_news.R]
  B --> C[News API<br/>JSON endpoint]
  C --> B
  B --> D[openai::create_chat_completion()]
  D --> B
  B --> E[Quarto render<br/>daily_report.qmd]
  E --> F[HTML file<br/>/docs/news.html]
  F --> G[GitHub Pages<br/>& e‑mail alert]
Automating Commodity‑Market Intelligence with ChatGPT Plus and R:
Building a Daily News Agent for Corn, Soybean & Other Agricultural Assets
This article documents how, with the help of ChatGPT Plus (USD 20 / month) as a coding co‑pilot, we designed and implemented an agent that fetches, filters, and summarises the latest news on corn, soybean, and other commodities that compose my PhD research portfolio. The workflow is built entirely in R and Quarto: it pulls headlines from a public news API, uses OpenAI’s chat‑completion endpoint for abstractive summarisation, and renders a self‑contained HTML report that is automatically updated every morning at 07:00. The step‑by‑step recipe—covering API interaction, prompt engineering, scheduling with cronR, and reproducible publishing—should serve as a template for researchers and practitioners who need continuous, low‑maintenance market intelligence.
1 Introduction
Keeping track of news that can move commodity prices is critical for risk management, forecasting, and portfolio re‑balancing. Yet manually scanning dozens of sources is time‑consuming and error‑prone. With the advent of large language models (LLMs) delivered as Software‑as‑a‑Service—such as ChatGPT Plus at USD 20 per month—we can now prototype intelligent news agents in hours, not weeks. This paper recounts the exact conversation‑driven workflow we followed inside ChatGPT to:
Specify functional requirements (sources, frequency, output format).
Generate and refine R code that hits a news API, parses JSON, and stores headlines in tidy form.
Craft prompts that steer OpenAI’s API to summarise clusters of related articles in plain English.
Schedule the pipeline on a headless Linux box so a fresh HTML report lands in my inbox—and on our Quarto site—every weekday at 07:00.
All code is reproducible and free of proprietary dependencies except for the API keys themselves.
2 System Overview
The pipeline has three external touchpoints:
News API – We use https://newsapi.org for simplicity; any REST aggregator that supports keyword search and ISO 8601 dates will work.
OpenAI API – Handles abstractive summarisation so that end‑users read concise bullets instead of raw headlines.
GitHub Pages (optional) – Hosts the rendered HTML; a parallel e‑mail task sends the file as an attachment.
3 Implementation Details
3.1 Environment & Secrets
Create a project‑level .Renviron (not committed to VCS):
Code
= "YOUR_NEWSAPI_KEY"
NEWSAPI_KEY = "YOUR_OPENAI_KEY" OPENAI_API_KEY
Then restart R or call readRenviron(“~/.Renviron”).
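A quick guard (a small addition, not part of the original script) confirms both keys are visible to the session before any request is sent:
Code
# Fail fast if either API key is missing from the environment
for (key in c("NEWSAPI_KEY", "OPENAI_API_KEY")) {
  if (!nzchar(Sys.getenv(key))) stop("Missing environment variable: ", key)
}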
3.2 Helper Functions
Code
library(httr2) # next‑gen HTTP for R
library(jsonlite)
library(dplyr)
library(tidyr)
library(lubridate)
library(purrr)
library(glue)
library(openai) # remotes::install_github("rOpenAI/openai")
# Small wrapper ----------------------------------------------------------
news_endpoint <- "https://newsapi.org/v2/everything"
<- function(keyword,
fetch_news from = Sys.Date() - 1,
to = Sys.Date(),
page_size = 100) {
<- request(news_endpoint) |>
resp req_url_query(
q = keyword,
language = "en",
sortBy = "publishedAt",
from = as.character(from),
to = as.character(to),
pageSize = page_size,
apiKey = Sys.getenv("NEWSAPI_KEY")
|>
) req_perform()
<- resp |>
out resp_body_json() |>
::pluck("articles") |>
purrr::as_tibble() |>
tibblemutate(
keyword = keyword,
publishedAt = ymd_hms(publishedAt, tz = "UTC")
)return(out)
}
# Summarise a tibble of articles with OpenAI ------------------------------
summarise_cluster <- function(df, model = "gpt-4o-mini") {

  prompt <- glue("
You are a financial analyst. Produce a 5‑bullet summary of the following
{nrow(df)} news headlines about **{unique(df$keyword)}** published in the last 24 h.
Focus on market‑moving information (prices, policy, weather, supply‑demand).
Write in clear, jargon‑free English. 120 words max.

HEADLINES:
{paste0('- ', df$title, collapse = '\n')}
")

  res <- openai::create_chat_completion(
    model = model,
    messages = list(
      list(role = "system",
           content = "You are an expert commodity market analyst."),
      list(role = "user", content = prompt)
    ),
    temperature = 0.3
  )

  summary <- res$choices[[1]]$message$content

  tibble(keyword = unique(df$keyword), summary = summary)
}
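With both helpers defined, each stage can be exercised interactively before the pipeline is scheduled (the object names below are illustrative):
Code
# Quick interactive test of the two helpers
corn_news <- fetch_news("corn")        # tibble of yesterday's corn headlines
nrow(corn_news)                        # how many articles came back
summarise_cluster(corn_news)           # one-row tibble: keyword + 5-bullet summary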
3.3 Daily Driver Script (fetch_news.R)
Code
library(dplyr)
library(purrr)

source("functions.R")  # helpers above

commodities <- c("corn", "soybean", "soybean meal", "soybean oil",
                 "wheat", "coffee", "cotton")

# 1 Pull raw headlines -----------------------------------------------------
news_raw <- map_dfr(commodities, fetch_news)

# 2 Deduplicate & keep latest headline per source --------------------------
news_clean <- news_raw |>
  distinct(url, .keep_all = TRUE)

# 3 Generate bullet summaries via OpenAI -----------------------------------
news_summaries <- news_clean |>
  group_split(keyword) |>
  map_dfr(summarise_cluster)

# 4 Persist to disk for the Quarto doc -------------------------------------
saveRDS(news_clean,     file = "data/news_headlines.rds")
saveRDS(news_summaries, file = "data/news_summaries.rds")
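As written, a single malformed response or rate-limit error aborts the whole run. A defensive variant (hypothetical hardening, not in the original script) wraps the fetcher with purrr::possibly() so a failing keyword contributes an empty tibble instead of an error:
Code
# Hypothetical hardening: tolerate per-keyword request failures
fetch_news_safe <- purrr::possibly(fetch_news, otherwise = dplyr::tibble())
news_raw <- map_dfr(commodities, fetch_news_safe)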
3.4 Reporting with Quarto (daily_report.qmd)
Code
---
title: "Daily Commodity News Digest"
format:
  html:
    self-contained: true
    toc: false
    theme: cosmo
---
library(gt)
library(glue)
library(dplyr)

news_clean     <- readRDS("data/news_headlines.rds")
news_summaries <- readRDS("data/news_summaries.rds")

news_summaries |>
  mutate(summary = gsub('\\n', ' ', summary)) |>
  knitr::kable()

news_clean |>
  select(publishedAt, source.name, title, url) |>
  arrange(desc(publishedAt)) |>
  gt::gt() |>
  gt::fmt_datetime(publishedAt, rows = everything(), sep = " ")
The content file produced above is minimal; we keep heavy tables folded behind Quarto’s code‑fold UI so the landing page loads fast.
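Folding is switched on with a single option in the YAML header (standard Quarto syntax, shown here as a minimal sketch):
Code
format:
  html:
    self-contained: true
    code-fold: true   # collapse code chunks behind a "Code" toggle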
3.5 Scheduling with cronR
Code
library(cronR)

cmd <- cron_rscript("fetch_news.R")

cron_add(command = cmd,
         frequency = 'daily',
         at = "07:00",
         description = "Fetch & summarise commodity news")
The same cron entry can also trigger quarto render daily_report.qmd. If you host your site on GitHub Pages, push the rendered HTML to the docs/ folder and commit. GitHub will automatically redeploy.
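One way to wire this up (a sketch; the script name, branch, and commit message are assumptions rather than part of the original setup) is a second scheduled R script:
Code
# deploy_report.R -- render the digest and publish it to GitHub Pages
quarto::quarto_render("daily_report.qmd")                   # writes daily_report.html
file.copy("daily_report.html", "docs/news.html", overwrite = TRUE)
system("git add docs/news.html && git commit -m 'Daily news digest' && git push origin main")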
4 Results
For 15 June 2025 the agent produced the following high‑level summary (example):
Corn
• USDA trimmed 2024/25 US corn output by 2 Mt on flooding in Iowa
• China booked 165 kt US cargo as domestic prices spike
• Brazil’s 2nd‑crop harvest reaches 38 %, dryness persists in Mato Grosso
• CME December futures up 2 % to USD 4.71 /bu
• UN FAO projects stable global ending stocks at 319 Mt
The corresponding HTML report (45 kB, self‑contained) was rendered in ~1.3 s on an 8‑year‑old laptop.
5 Discussion
Low code debt. ChatGPT wrote ~80 % of the R required; I mainly refactored style and added edge‑case handling.
Cost. With the gpt‑4o‑mini model the OpenAI bill is ≈ USD 0.03 per day, trivial compared with Bloomberg or Refinitiv; a back‑of‑envelope check follows this list.
Latency. End‑to‑end pipeline averages 5–7 seconds.
Extensibility. Swap NewsAPI for GDELT 2.0 or AlphaSense by editing two lines in fetch_news().
Limitations. Summaries inherit the biases of both the underlying sources and the LLM; validation against raw headlines remains crucial.
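The cost figure is easy to sanity-check. Assuming gpt-4o-mini list prices of roughly USD 0.15 per million input tokens and USD 0.60 per million output tokens, with deliberately generous token counts for seven keyword clusters (both the prices and the counts below are assumptions, not measurements):
Code
# Hypothetical back-of-envelope daily cost
price_in   <- 0.15 / 1e6   # USD per input token (assumed list price)
price_out  <- 0.60 / 1e6   # USD per output token (assumed list price)
tokens_in  <- 7 * 10000    # 7 keywords, generous ~10k prompt tokens each
tokens_out <- 7 * 300      # ~300 tokens per 5-bullet summary

tokens_in * price_in + tokens_out * price_out
#> ~0.012 USD, comfortably below the quoted ~USD 0.03 per day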
6 Conclusion
Leveraging ChatGPT Plus as a pair‑programmer allowed us to bootstrap a reliable, fully automated commodity news agent in less than a day. The resulting workflow—cron → R → OpenAI → Quarto—delivers timely, actionable intelligence to support my doctoral research in multi‑asset optimisation without incurring enterprise‑grade data costs.
References
Wickham, H. httr2: Perform HTTP Requests and Process the Responses. R package. https://httr2.r-lib.org
Wijffels, J. cronR: Schedule R Scripts and Processes with the cron Job Scheduler. R package. https://github.com/bnosac/cronR
OpenAI. API Reference. https://platform.openai.com/docs
NewsAPI Ltd. NewsAPI v2 Documentation. https://newsapi.org/docs