What Would a “Green AI” Look Like?


More Intelligent Than Artificial

What would an environmentally-friendly AI look like in the future?

Photo by Chelsea on Unsplash

Disclaimer: This is an opinion piece, and the opinions expressed here only belong to the author.

You’re probably asking, “Raveena, just what exactly do you mean by a Green AI?” Well, I did reveal my intention in the italic summary above, but that doesn’t give us a solution for what environmentally friendly AI could look like, since none of us can accurately predict the future 10 or 20 years out, or even 5 years from now. Heck, just 9 years ago, the first massive deep neural network was created, able to bang out record-breaking results on image classification (think of the ImageNet competition). Neural networks could suddenly classify real digital images of tens of species of dogs (I’ll be honest with you right now: I had NO clue that “Maltese dog” and “Tibetan terrier” were actual dog breeds until I peeked at the labels in the ImageNet dataset!), animals, objects, and the like. Over the last few years, we’ve built even better deep-learning systems that can not only classify images but work with videos (I never would’ve expected that) and, in my opinion the most mind-blowing results of all, generate captions for images and power language models such as GPT-2, BERT, and GPT-3, which produce strikingly human-like, readable sentences and paragraphs. But is this the most efficient type of AI we can create? Is this really progress toward a “greener AI”?

I argue that in fact, it’s probably not, and I also believe that building ever deeper networks with ever more complicated architectures isn’t really paving a road toward “greener AI.” For one, although these models may be beasts at language generation, caption generation, and other tasks, they are also beasts of energy usage.

In a recent paper by Dr. Timnit Gebru and her colleagues (Dr. Gebru was “relieved” of her position at Google for writing this paper), she cited a computational-linguistics paper by Emma Strubell, which estimated that training a single large Transformer language model from scratch emits around 300 tons of CO2, while an individual human is responsible for around 5 tons of CO2 a year. (By the way, a “Transformer” is just the neural-network architecture behind most of today’s big deep-learning language models.) In the same paper, Strubell and her colleagues calculated that training a single BERT model from scratch has roughly the carbon footprint of one passenger on a commercial trans-American flight. Strubell also calculated that squeezing a tiny improvement (a fraction of a point on a standard translation benchmark) out of these models costs an additional $150,000 in compute! Clearly, the energy usage and carbon footprint of training a single deep language model are way out of proportion to average human emissions.
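To put those numbers side by side (and remember, the figures quoted above are themselves rough estimates), a quick back-of-the-envelope calculation:

```python
# Back-of-the-envelope comparison, using the rough figures quoted above.
training_emissions_tons = 300   # one large Transformer trained from scratch (estimate)
human_emissions_per_year = 5    # average individual footprint, tons of CO2 per year

person_years = training_emissions_tons / human_emissions_per_year
print(f"One training run is about {person_years:.0f} person-years of emissions")
# -> One training run is about 60 person-years of emissions
```

That’s roughly 60 person-years of emissions for a single training run.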

Now, you could say: well, these models are only trained once, then deployed to the cloud where anyone can use them. However, serving large deep-learning models from the cloud consumes a lot of energy too; as cited in a Forbes article about NVIDIA, around 80% of the energy a model uses over its lifetime goes to inference alone. Additionally, even with pre-trained models, ordinary users like you and me often need GPUs to do transfer learning. Buying GPUs can be prohibitively expensive for new machine-learning users, and GPU time on the cloud often isn’t available for free around the clock. Granted, Google has developed more efficient chips, such as the Tensor Processing Unit (TPU), and other specialized hardware. But Google is a corporate behemoth with more money and resources than you or I will ever have in our lifetimes. Should we really trust large corporations to democratize AI for all of us while they source ever more oceans of our data to train their newer, sexier, larger models?
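To make the transfer-learning point concrete, here’s a minimal sketch of what that workflow typically looks like in PyTorch. (The ResNet-18 backbone and the 10-class task are placeholders of my choosing, not anything from the articles cited above.)

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a model that someone else already paid the training cost for.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained backbone so we don't retrain it from scratch...
for param in model.parameters():
    param.requires_grad = False

# ...and replace only the final classification head for our own task.
num_classes = 10  # hypothetical downstream task
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters get updated, but even this step
# usually wants a GPU to finish in reasonable time.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```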

What Can We Do?

So what’s to be done? I’m not implying in any way that deep learning is bad (it’s truly an engineering marvel in itself), but by itself, deep learning hasn’t come close to the human brain’s flexibility to generalize across a large variety of tasks. And personally, I believe proclaiming that deep learning is the singular solution to AI and intelligence might not be the best idea.

The final point I’ll make about deep learning’s limitations is this: traditional deep-learning models don’t have a good way of dealing with uncertainty. Here, let me give you an example:

Photo by Lubo Minar on Unsplash

Our human brains can tell with high confidence that the photo above shows a metropolitan, perhaps downtown, section of a big city, even though the photo is blurred. But traditional convolutional neural networks have no built-in mechanism for reporting their own uncertainty when they classify an image: a standard softmax output can be confidently wrong.
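There are workarounds researchers bolt on, such as Monte Carlo dropout: keep dropout active at prediction time, run the network many times, and treat the spread of its answers as a rough uncertainty estimate. A minimal sketch (the tiny network here is purely illustrative):

```python
import torch
import torch.nn as nn

# A toy classifier with dropout; the architecture is purely illustrative.
model = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(32, 10),
)

def mc_dropout_predict(model, x, n_samples=50):
    """Run the model repeatedly with dropout ON and look at the spread."""
    model.train()  # keeps dropout active, unlike model.eval()
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)  # mean prediction + spread

x = torch.randn(1, 64)  # stand-in for a (blurry) image's features
mean, spread = mc_dropout_predict(model, x)
# A high spread on the winning class is the network telling us it's unsure.
```

The point is that the uncertainty here is bolted on from the outside; it isn’t part of how the model reasons.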

Towards AI Flexibility

What we need, in my honest, humble opinion, is something cleverer than pure function approximation: we need to understand how inference and model-building work in the brain. Admittedly, this is like passing the hot potato down the line, since it means answering a question probably harder than “How do we democratize AI?” or “How do we make a greener AI?” But let’s go with the flow and entertain it.

If you’ve ever taken the SAT, you probably remember those dreaded sentence-completion questions, where an incomplete sentence is given with a blank (or several blanks) and you have to fill it in with the “best fit” word. An SAT-style example might look like this:

Since Kelly didn’t expect anything for Valentine’s Day, she was _________ by the bouquet of roses on her desk.

From the answer choices 1) intrigued, 2) astonished, 3) love-struck, 4) disappointed, 5) irritated, which word would you choose?

The answer is (2), and I’m sure that was “obvious” to most of you. But would it really be that obvious to the AI systems we currently have for natural-language processing? Several of the choices work grammatically, and there might be a way to get the right answer from the statistics of frequently co-occurring words. Solving it the way we do, though, means picking the word that best fits the context and meaning of the sentence, and that also fits our intuition for how Kelly would feel.
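You can actually try this on one of today’s language models. Hugging Face’s transformers library ships a fill-mask pipeline for BERT-style models (assuming you have the library installed); whether it lands on “astonished” is anyone’s guess, which is rather my point:

```python
from transformers import pipeline

# BERT predicts a distribution over words for the [MASK] token.
fill = pipeline("fill-mask", model="bert-base-uncased")

sentence = (
    "Since Kelly didn't expect anything for Valentine's Day, "
    "she was [MASK] by the bouquet of roses on her desk."
)

for guess in fill(sentence, top_k=5):
    print(f"{guess['token_str']:>12}  (score: {guess['score']:.3f})")
# The model ranks words by corpus statistics, not by an intuitive
# model of how Kelly feels, so it may or may not pick "astonished".
```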

Now, this example might not seem to involve probabilities on the surface of it; after all, these are SAT words! But think about it: if you didn’t know much about the SAT, hadn’t studied for it, and just saw this question, you’d be making an educated guess, right? You know, prior to seeing the question, that Valentine’s Day is associated with roses, and you also have a prior intuitive model for how Kelly might react, given the information in the sentence and the fact that we know nothing else about Kelly’s intentions, beliefs, or emotions. This type of high-level, flexible, model-based abstract reasoning is something human brains are especially good at, and today’s AI systems are not really near this level, especially when it comes to modeling the behavior of other humans.

But over the last 10–15 years, cognitive scientists, cognitive psychologists, and AI researchers together have been making headway on machine systems that can do this type of reasoning, under the banner of probabilistic programming. I won’t go into too much detail here, so if you want to learn the basics before returning to this article, check out the TDS Editors’ September Edition article on it.

In my humble opinion, probabilistic programming offers a pathway to environmentally friendly AI, because it makes it possible to express a large amount of reasoning without brute-forcing inference over thousands or millions of data points. When you see a picture of a car or a motorcycle, you don’t need to know the exact locations and values of thousands of pixels; instead, you have some idea of the primitives of the object. The wheels, chassis, windows, general edge shape, mirrors, an engine hood: these are the “primitives,” or building blocks, of a car. With these building blocks, a human could draw an almost infinite variety of cars! In fact, researcher and computer-graphics expert Daniel Ritchie has used probabilistic programming to generate realistic 3D virtual spaceships and other objects from just a few examples and primitives. He details this process in an amazing video from the 2020 Probabilistic Programming Conference, where he uses a graphics program to render many variations of spaceships from their primitives, for example the body of the ship, the wings, and the thrusters.
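To give a flavor of what “primitives plus priors” means, here’s a toy sketch in plain Python. No real probabilistic-programming language, and the priors are numbers I made up, but it shows how a few lines of generative code can stand in for mountains of pixel data:

```python
import random

# Toy generative model: a "car" is a handful of primitives with
# parameters drawn from priors, not thousands of memorized pixels.
def sample_car():
    return {
        "wheel_radius": random.uniform(0.25, 0.45),  # meters; made-up priors
        "body_length":  random.uniform(3.5, 5.0),
        "num_windows":  random.choice([2, 4, 6]),
        "has_spoiler":  random.random() < 0.2,
    }

# A few lines of generative code yield endless distinct-but-plausible cars.
for _ in range(3):
    print(sample_car())
```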

Coming Back Full Circle

Speaking of Prof. Ritchie’s computer-graphics work, I do want to come back full circle to the beginning of this piece. In work from 2016 (and in the video linked above), Ritchie notes that performing inference in probabilistic programming, that is, inferring the hidden generative program given the probabilistic model (especially in the context of graphics), is a very complex task. The usual inference techniques are random-sampling methods, but on their own these take quite a long time to converge on an answer. In the video talk, Prof. Ritchie describes generating graphical representations of alphabet characters from a single sample image.

Using random sampling alone, the inference took about 10 minutes, but Ritchie was able to combine the probabilistic graphics program with another tool and generate the image 10 times faster. What was this tool, you may ask? Coming back full circle, it was, lo and behold, a neural network. It turns out neural networks are particularly good at the inference side of probabilistic programming: the network “guides” the sampler toward parameter values that fit the target, so the graphics program finds a good answer with far fewer samples.
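A cartoon of that idea, with everything schematic (the “neural guide” below just pretends to be a trained network by proposing values near the answer):

```python
import random

target = 0.8  # hidden parameter we want to infer from an observed image

def score(guess):
    return -abs(guess - target)  # stand-in for "how well the render matches"

def blind_sampler():
    return random.uniform(0, 1)  # propose from the prior: slow to get lucky

def neural_guide():
    # Pretend a trained network maps the observation to a good proposal.
    return random.gauss(target, 0.05)

def infer(proposer, n=5):
    return max((proposer() for _ in range(n)), key=score)

print("blind  best of 5:", round(infer(blind_sampler), 3))
print("guided best of 5:", round(infer(neural_guide), 3))  # lands near 0.8
```

Real amortized-inference systems are far more involved, but the division of labor is the same: the probabilistic program supplies the model, and the network supplies fast guesses for inference.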

Conclusion

I hope I’ve been able to show you, the reader, some of my thoughts on what an environmentally friendly type of AI could look like in the near future. Of course, if we could predict the future accurately, time would be kinda meaningless. But if I had to guess, creating a greener AI will involve deeply understanding how the brain achieves the kinds of intelligence that current AI methods lack. Green AI will have to build in the types of abstract reasoning human brains can do, so we don’t have to store huge amounts of data in data centers and spend hundreds of thousands of dollars training new language models and other human-imitating systems from scratch. Green AI also needs uncertainty built in, to deal with the fact that the real world is messy, not at all clean like the logical statements of mathematics. And crucially, Green AI will need input not just from computer scientists and mathematicians, but from cognitive science, cognitive psychology, neuroscience, cultural psychology, and many other diverse areas.

