Category: DataScience
-
Agint: Agentic Graph Compilation for Software Engineering Agents

ref: https://arxiv.org/pdf/2509.00625 webpage: https://www.agintai.com/ Summary Agint differs from workflow agents in that it acts as a compiler agent (like an IDE). It translates natural language into an executable, result-oriented DAG; operating at the graph level enables parallel execution and removes the constraint of linear, chain-based generation. This approach emphasizes result-oriented execution: workflows are not fixed upfront but are dynamically…
-
NetGent: Agent-Based Automation of Network Application Workflows

ref: https://arxiv.org/pdf/2509.00625 github: https://github.com/SNL-UCSB/netgent Summary The paper brings state-machine logic, similar to how games operate, to UI automation. It compiles natural language into reusable, iterative states, where each state uses LLM-based reasoning for action selection and execution. This approach extends ReAct by adding explicit state memory and caching (compile-then-replay), reducing…
-
Needle in the Web: A Benchmark for Retrieving Targeted Web Pages in the Wild
ref: https://arxiv.org/pdf/2512.16553 github: https://github.com/Tango-Whiskyman/Needle_in_the_Web Summary Needle in the Web explores a new benchmark for evaluating LLM search agents. It uses a broadcast + parallel retrieval approach (fuzzy exploratory search) instead of traditional multi-hop reasoning. Retrieved webpages are verified against a single source to ensure all query criteria are satisfied, and one is selected as the “ground-truth” page for answer generation.…
-
ScreenAgent : A Vision Language Model-driven Computer Control Agent
ref: https://arxiv.org/pdf/2402.07945 github: https://github.com/niuzaisheng/ScreenAgent Summary Performed end-to-end LLM agent development by constructing a real desktop interaction environment through VNC, enabling the agent to perceive screenshots and issue mouse and keyboard actions. A UI automation process was introduced, with actions formalized as function calls and organized into planning, acting, and reflecting loops. Within the acting and reflecting…
-
Finetuning LLMs for Automatic Form Interaction on Web-Browser in Selenium Testing Framework
ref: https://arxiv.org/pdf/2511.15168 Summary Trained a new LLM to understand web forms (HTML code) and generate reliable Selenium scripts for webpage testing. Unlike WebVoyager, which relies on visual navigation (UI agent), this method models UI automation as deterministic code generation over the UI/UX. While this improves execution, the approach still suffers from typical code-generation failures.…
-
The Iceberg Index: Measuring Skills-centered Exposure in the AI Economy
ref: https://arxiv.org/pdf/2510.25137 Summary Performed a workforce “digital twin” simulation of how human capabilities overlap with AI. Existing workforce metrics do not work for AI-assisted tasks and skills. A new metric is introduced to identify the tasks and skills that correlate with wages. While tech roles (programmer, data scientist, and program manager) are already disrupted, exposure in repetitive cognitive and administrative work remains largely invisible.
-
Predicting 100% in IRIS dataset

dataset: https://archive.ics.uci.edu/dataset/53/iris While scrolling through YouTube, I came across this video: https://www.youtube.com/watch?v=MdOCu2Gr-0g It explores Fibonacci numbers, which sparked a thought: could I experiment with them in a unique way, perhaps using the Iris dataset? First, let’s create a sequence of Fibonacci numbers. Result: Next, we create a function that returns the largest Fibonacci number closest to a…
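The excerpt's code is not shown here, so the following is only a minimal sketch of the first step it mentions, building a sequence of Fibonacci numbers. The function name `fibonacci_sequence`, its `count` parameter, and the choice to start from 0 and 1 are assumptions for illustration, not the post's actual code.

```python
def fibonacci_sequence(count):
    """Return the first `count` Fibonacci numbers, starting from 0 and 1.

    Hypothetical helper: the name and the 0, 1 starting convention are
    assumptions, not taken from the original post.
    """
    sequence = []
    a, b = 0, 1
    for _ in range(count):
        sequence.append(a)
        # Advance the pair: the next number is the sum of the previous two.
        a, b = b, a + b
    return sequence


print(fibonacci_sequence(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

An iterative loop keeps the sketch linear in `count` and avoids the repeated work of a naive recursive definition.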
