By that time, Davy Kirkpatrick said, the table of every individual detection made over the previous ten years was creeping toward 200 billion rows.

Engineering often advances when someone stops treating a dataset as a byproduct and starts treating it as an instrument. Over roughly ten years, NASA's NEOWISE program built up a rolling infrared survey of the entire sky. The survey was designed to help detect near-Earth objects, but it was also full of signals from much more distant sources. Within those measurements was another kind of sky: objects that flicker, pulse, or dim, not because the telescope moved but because the universe itself changes.
The NEOWISE archive is not a tidy atlas. It is a flood of single-exposure detections, almost 200 billion entries, each a momentary infrared brightness measurement that must be tied to a physical object and then analyzed over time. Traditional methods can investigate variability only in small slices; a uniform, all-sky census quickly exceeds what human-scale pattern finding can handle. For the lab at Caltech’s Infrared Processing and Analysis Center (IPAC), headed by Kirkpatrick, the main problem was infrastructural: how to convert an unwieldy archive into a useful time-domain map without smoothing rare, significant changes into statistical noise.
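The bookkeeping behind "tied to a physical object and then analyzed over time" can be sketched in a few lines. This is a minimal illustration, not the IPAC pipeline: it assumes each detection has already been matched to a source, and the `(source_id, time, flux)` record layout and the function name are hypothetical.

```python
from collections import defaultdict


def build_light_curves(detections):
    """Group detections by source and order each group by time.

    `detections` is an iterable of (source_id, time, flux) tuples.
    The record layout is an assumption made for this sketch; matching
    a detection to its source is taken as already done.
    """
    grouped = defaultdict(list)
    for source_id, time, flux in detections:
        grouped[source_id].append((time, flux))

    curves = {}
    for source_id, points in grouped.items():
        # A light curve is only meaningful in time order.
        points.sort(key=lambda point: point[0])
        times = [t for t, _ in points]
        fluxes = [f for _, f in points]
        curves[source_id] = (times, fluxes)
    return curves


# Toy input: five detections belonging to two sources, in arbitrary order.
toy_detections = [
    (2, 5.0, 10.0),
    (1, 3.0, 20.0),
    (2, 1.0, 30.0),
    (1, 0.5, 40.0),
    (2, 2.0, 50.0),
]
toy_curves = build_light_curves(toy_detections)
```

At the scale of the real archive this grouping would run inside a database or a distributed job rather than in memory, but the output is the same kind of object: one time-ordered brightness series per source.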
That constraint met an exceptionally ambitious summer researcher. Matteo (Matthew) Paz, a student at Pasadena High School when he joined the group, had been attending Caltech's public stargazing lectures since childhood. He arrived through the Summer Research Connection program and quickly decided not to be limited to the plan of studying a “small patch of sky.” His idea was to build an AI system that would scan the whole database for variability as such: the signature of quasars, pulsating stars, eclipsing binaries, and transient events.
The result is a model called VARnet, which embodies what is essentially an engineering decision: to combine signal-processing approaches that astronomers trust with machine-learning approaches that scale. The pipeline uses wavelets to dampen the impact of spurious measurements and a Fourier-based step to isolate periodic structure in irregularly sampled light curves, followed by convolutional neural networks that sort sources into four classes: non-variable, transient, intrinsic pulsator, or eclipsing binary. In a peer-reviewed description by Paz, VARnet achieved an F1 score of 0.91 and processed sources in less than 53 microseconds per object on a GPU, fast enough to make analyzing the entire survey computationally feasible rather than aspirational. The paper appears in The Astronomical Journal.
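To make the idea of finding periodic structure in irregularly sampled data concrete, here is a minimal sketch of the classical Lomb-Scargle periodogram, a standard tool for unevenly spaced time series. It is offered as an illustration of the general technique, not as VARnet's own Fourier step, whose details are not given here. The function names, the synthetic light curve, and the frequency grid are all assumptions made for the example, and the code uses only the standard library for portability; a production version would be vectorized.

```python
import math
import random


def lomb_scargle_power(times, values, frequency):
    """Classical Lomb-Scargle power at one trial frequency.

    Works for unevenly spaced `times`. Values are mean-subtracted and
    the result is left unnormalized.
    """
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    omega = 2.0 * math.pi * frequency

    # Time offset tau makes the sine and cosine terms orthogonal.
    sin2 = sum(math.sin(2.0 * omega * t) for t in times)
    cos2 = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(sin2, cos2) / (2.0 * omega)

    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    y_cos = sum(y * c for y, c in zip(centered, cos_terms))
    y_sin = sum(y * s for y, s in zip(centered, sin_terms))
    cos_sq = sum(c * c for c in cos_terms)
    sin_sq = sum(s * s for s in sin_terms)

    power = 0.0
    if cos_sq > 0.0:
        power += y_cos * y_cos / cos_sq
    if sin_sq > 0.0:
        power += y_sin * y_sin / sin_sq
    return 0.5 * power


def best_frequency(times, values, frequencies):
    """Return the trial frequency with the highest power."""
    return max(frequencies, key=lambda f: lomb_scargle_power(times, values, f))


# Synthetic example: a noisy sinusoid observed at random times.
rng = random.Random(0)
true_frequency = 0.25  # cycles per unit time (made up for the demo)
obs_times = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
obs_values = [
    math.sin(2.0 * math.pi * true_frequency * t) + rng.gauss(0.0, 0.1)
    for t in obs_times
]
trial_frequencies = [0.01 * k for k in range(1, 101)]
estimated_frequency = best_frequency(obs_times, obs_values, trial_frequencies)
```

On the synthetic curve, the strongest peak should land at or next to the injected frequency. A scan like this turns an irregular series into a frequency-domain summary of the kind a downstream classifier could consume.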
Speed, however, is only part of the story. NEOWISE observed at a slow cadence set by its orbit and scanning plan, which creates both strengths and blind spots. As Paz and colleagues learned, that rhythm is poorly suited to capturing objects that flare and fade abruptly, or those that change subtly over many years. The catalog is thus analogous to an engineered sensor network: strong within its sampling range, incomplete beyond it, and best used in conjunction with follow-up observations from other facilities.
Despite those limitations, the scale is impressive. Running VARnet on the archive identified approximately 1.5 million candidate variable sources: signals that evidently vary in the infrared and now need astronomical validation and targeted re-observation. The project was also a case study in how modern discovery can arrive: not through new hardware in space but through new techniques on the ground that unlock data products a mission had previously “forgotten.” Paz later described what could carry over: the model he developed could be applied to other time-domain studies in astronomy, and possibly to anything that arrives as a time series in which periodic components matter. Atmospheric effects such as pollution could also be studied this way, he said, since seasonal and day-night cycles play a large role in them.
The mentorship structure behind the work mattered too. Kirkpatrick, who has written about how a high school teacher helped him find his own path into science, fostered a lab culture in which ambitious questions were not brushed off as distractions from the “real” project. By 2024, Paz had returned to the work, mentoring fellow students and working as an IPAC employee. For engineering-minded readers, the moral is not prodigy but process: a mature archive, a scalable algorithm, and a workflow that converts raw time series into a shared resource, one that can be cross-matched, stress-tested, and extended as new sky surveys come online.

