5/4/2020 Jenny Applequist, CSL
Every summer, the United States’ largest recurring “dead zone”—an area of the ocean in which oxygen levels are too low to support marine life—reappears in the Gulf of Mexico. One of its main causes is the huge quantities of excess agricultural fertilizer that flow into the Gulf from across the vast Mississippi River drainage basin. It’s a lose-lose situation in which farmers waste money by applying more fertilizer than their crops can use, and downstream ecosystems suffer from the resulting pollution.
AI research being done in the group of CSL professor Naira Hovakimyan has just received high-profile attention for making important progress towards a solution. On April 22, the work was featured in the prominent deeplearning.ai blog.
“We realized at the beginning of this research that farmers usually apply much more fertilizer than they need,” notes Alexandre Barbosa, a student (co-advised by Hovakimyan, the W. Grafton and Lillian B. Wilkins Professor of Mechanical Science and Engineering, and Prof. Nico Martin, assistant professor of crop sciences) who led the work. In addition to causing environmental damage, the extra fertilizer also reduces crop yields. “If you put more fertilizer than the plant can handle, you actually start killing the plant and you produce less. This is actually a very big question in agriculture; it’s driving a lot of research in this area.”
Of course, the problems caused by excess fertilization would be alleviated if farmers could make accurate determinations of the right amounts of fertilizer and seed to apply to their lands. Figuring out how much to apply isn’t too difficult if the cropland in question has uniform physical characteristics, such as elevation and soil quality. “When you have a very homogeneous field, and very flat, things are not varied—it’s easier to predict,” says Barbosa. The simple multiple linear regression algorithms that are already in use do it just fine.
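For readers who want a concrete picture of that baseline, here is a minimal sketch of a multiple linear regression fit with NumPy. The two inputs (a nitrogen rate and a seed rate), the synthetic data, and the function names are assumptions made for illustration. They are not taken from the study.

```python
import numpy as np

def fit_linear_yield_model(features, yields):
    """Least-squares fit of yield = intercept + features @ coefficients."""
    # Prepend a column of ones so the first weight acts as the intercept.
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, yields, rcond=None)
    return weights

def predict_yield(weights, features):
    """Apply fitted weights (intercept first) to new feature rows."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ weights

# Synthetic, made-up data standing in for a uniform field.
# Columns: normalized nitrogen rate, normalized seed rate (both hypothetical).
rng = np.random.default_rng(0)
features = rng.uniform(0.0, 1.0, size=(200, 2))
true_weights = np.array([5.0, 2.0, 1.5])  # intercept, nitrogen, seed
yields = true_weights[0] + features @ true_weights[1:]

weights = fit_linear_yield_model(features, yields)
predicted = predict_yield(weights, features)
```

Because a model like this assigns one coefficient per input for the whole field, it has no way to express yield responses that change from one part of the field to another.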
The challenge is much greater, though, if the important attributes vary significantly across a field.
To tackle that challenge, Barbosa developed four convolutional neural network (CNN) architectures for predicting yield, each of which combines the available data on a cornfield in a different way. To test the CNNs, he used data collected from nine cornfields in Illinois, Ohio, Nebraska, and Kansas.
He divided each field into a grid of 5-by-5-meter cells and provided each CNN with data on five attributes of each cell: elevation, soil quality, satellite imagery, and varying levels of fertilization and seeding. He then evaluated how accurately each CNN predicted yield.
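One way to picture that input is as a stack of five aligned grids, one per attribute. The sketch below is only an illustration. The field dimensions, the attribute names, and the choice of a single channel per attribute (satellite imagery may well carry several bands) are assumptions, not details from the study.

```python
import math
import numpy as np

CELL_SIZE_M = 5.0  # cell edge length described in the article
# Hypothetical attribute names, one raster layer each.
ATTRIBUTES = ["elevation", "soil_quality", "satellite", "fertilizer_rate", "seed_rate"]

def grid_shape(field_height_m, field_width_m, cell_size_m=CELL_SIZE_M):
    """Number of cell rows and columns needed to cover a rectangular field."""
    rows = math.ceil(field_height_m / cell_size_m)
    cols = math.ceil(field_width_m / cell_size_m)
    return rows, cols

def stack_attributes(rasters):
    """Stack per-attribute grids into one array of shape (attributes, rows, cols)."""
    return np.stack([rasters[name] for name in ATTRIBUTES], axis=0)

# A made-up 400 m by 250 m field, filled with placeholder zeros.
rows, cols = grid_shape(400.0, 250.0)
rasters = {name: np.zeros((rows, cols)) for name in ATTRIBUTES}
field_tensor = stack_attributes(rasters)
```

Keeping the attributes as separate layers of one array makes it easy either to feed them to a network together or to hand each layer to its own branch.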
Barbosa discovered that the most effective CNN, dubbed “Late Fusion” (LF), outperformed not just his other three CNNs, but also the prior solutions, including a random forest model and a multiple linear regression model.
The LF architecture differs from the other CNNs in that it first analyzes each of the five cell attributes separately, examining how that attribute varies across the cells of the cornfield. Only afterwards are the separate findings brought together to form a cornfield-wide analysis encompassing all five attributes.
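The idea of processing each attribute on its own and combining the results only at the end can be sketched in a few lines of NumPy. This is a toy forward pass with random, untrained weights, not the architecture from the paper. The single convolution per branch, the 3-by-3 kernels, the 9-by-9 input patch, and all function names are assumptions chosen to keep the example short.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Single-channel 2D cross-correlation with no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def branch_feature(channel, kernel):
    """One attribute's branch: convolution, ReLU, then global average pooling."""
    activated = np.maximum(conv2d_valid(channel, kernel), 0.0)
    return activated.mean()

def late_fusion_predict(patch, kernels, fusion_weights, bias):
    """patch has shape (attributes, H, W); attributes meet only in the last step."""
    per_attribute = np.array(
        [branch_feature(channel, kernel) for channel, kernel in zip(patch, kernels)]
    )
    # Fusion step: a linear combination of the independently extracted features.
    return float(per_attribute @ fusion_weights + bias)

rng = np.random.default_rng(0)
patch = rng.normal(size=(5, 9, 9))      # five attributes on a 9x9 block of cells
kernels = rng.normal(size=(5, 3, 3))    # one untrained kernel per attribute
fusion_weights = rng.normal(size=5)
prediction = late_fusion_predict(patch, kernels, fusion_weights, bias=0.0)
```

An early-fusion variant would instead convolve over all five layers at once, so that the very first filters could respond to combinations of attributes within a small neighborhood of cells.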
Why would the LF strategy work the best? “Probably because of the physics of what’s happening in the field,” says Barbosa.
Early fusion, in which the attributes are combined before any features are extracted, would work better than late fusion if, for example, the fertilizer interacted with soil quality in a very fine-grained way, so that the model would need to capture the attributes’ interaction at high resolution. Barbosa’s findings suggest that the attributes in fact have a more generalized interaction with each other.
“Because of the characteristics of fields, it’s more efficient to make the feature extraction from each input independently and then combine them. It’s a more simple model and I think that’s one of the reasons it worked better,” he says.
Since the work featured in deeplearning.ai was published, Barbosa has had a second paper accepted that refines the optimization approach, which finds the best fertilizer and seed rate maps to improve yield. The next important step will be to quantify the uncertainty in the LF CNN’s recommendations, as that will be key to farmers’ decision-making process.
The original publication is “Modeling yield response to crop management using convolutional neural networks,” by Alexandre Barbosa, Rodrigo Trevisan, Naira Hovakimyan, and Nicolas F. Martin, Computers and Electronics in Agriculture, vol. 170, article 105197, March 2020, https://doi.org/10.1016/j.compag.2019.105197.