Upscaling and an Important Note About Photo “AI”

One of these is a photo. One is a digital illustration.
John Scalzi

Because I’m a digital photography nerd, I have a lot of programs and Photoshop plugins designed to tweak photos and make them better, or, maybe more accurately, less obviously bad. One of the hot new sectors of digital photography programs is the one where “Artificial Intelligence” is employed to do all manner of things, including colorizing, editing and upscaling. Some of this is baked into Photoshop directly — Adobe has a “Neural Filters” section for this — while other companies are supplying standalone programs and plugins.

Truth be told, all of these companies have been touting “AI” for a while now. But in the last couple of iterations of these tools and programs, there’s been a real leap in… well, something, anyway. The quality of the output of these tools has become strikingly better.

As an example, I present to you the before and after picture above. The original picture on the left was a 200-pixel-wide photo of Athena as a toddler. There had been a larger version of it way back when, but I had cropped it way down for an era when monitors were 640 x 480, and then tossed or misplaced the original photo. So the blocky, blotchy low-resolution picture of my kid is the only one I have now. The picture on the right is a 4x upscaling using a program called Topaz GigaPixel AI, which takes the information from the original picture, and using “AI,” makes guesses at what the picture should look like at a higher resolution, then applies those guesses. In this case, it guessed pretty darn well.

Which is remarkable to me, because even just a couple iterations of the GigaPixel program back, it wasn’t doing that great of a job to my eye — it could smooth out jagged edges on photos just fine, but it was questionable on patterns and tended to make a hash of faces. Its primary utility was that it could do “just okay” attempts at upscaling much faster than I could do that “just okay” work on my own. This iteration of the program, however, does better than “just okay,” more frequently than not, and now does things well beyond my own skill level.

It’s still not perfect; some other pictures of Athena from this era that I upsampled didn’t quite guess her face correctly, so she didn’t look as much like she actually did at the time, and more like a generic toddler. But that generic toddler looked perfectly reasonable, and not like a machine-generated mess. That counts as an improvement.

Now, it’s important to acknowledge a thing about these new “AI”-assisted pictures, which is that they are no longer photographs. They’re something different, closer to a digital illustration than anything else. The upscaled picture of Athena here is the automated equivalent of an artist making an airbrushed painting of my kid based on a tiny old photo. It’s good, and it’s pretty accurate, and I’m glad I have a larger version of that tiny image. But it’s not a photograph anymore. It’s an illustrated guess at what a more detailed version of the photograph would have been.

Is this a problem? Outside of a courtroom, probably not. But it’s still worth remembering that the already-extremely-permeable line between photograph and illustration is now even more so. Also, if you weren’t doing so already, you should treat any “photo” you see as an illustration until and unless you can see the provenance, or it’s from a trusted source. This is why, incidentally, AP and most other news organizations have strict limits on how photos can be altered. I’d guess that a 4x “AI”-assisted enhancement would fall well outside the organization’s definition of acceptable alteration. So, you know, build that into your world view. In a world of social media filters turning people into cats or switching their gender presentation, this internalization may not be as much of a sticking point as it once was.

With that said, it’s still a pretty nifty thing, and I will play with it a lot now, especially for older, smaller digital pictures I have, and to (intentionally) make illustrations that are based from those upscaled originals. I’m glad to have the capability. And that capability is only going to get more advanced from here.

— JS

21 Comments on “Upscaling and an Important Note About Photo “AI””

  1. The term “upscaling” is itself misleading; that is not what is happening. The algorithm adds information to the image to make it look more detailed than it is, and that information is derived from the training data. The end result is a mix of original data and training data that represents what most humans would expect a higher-quality photo to look like. But that is only because the learning algorithm trained the network to optimize for that expectation; the network has no knowledge at all of the real detail that would have been in a higher-quality original.
    What is the big deal? The big deal comes if a photo with a slightly blurry face in it has been “enhanced” and that face now resembles a living person. The detail that makes up that “person” comes from a cloud of data mixed from thousands of human faces, none of which is likely to be the person the image now resembles. There is no proof of anything here; the original might not even be an actual face, just something the algorithm interpreted as such.
    Now let us guess how many people will understand this and take it into account when presented with “photographic evidence.”

  2. Knowing nothing about the technology behind it, I am guessing that they took high-res and low-res images of the same things to set the standard for what “right” upscaling looks like. But I am reminded of the oft-clichéd “Enhance” scene from Blade Runner. Hollywood has been imagining this capability for years.

    I could see using this as part of an investigation, but our host is right, it should not be used to point the finger at a suspect in a court of law.

  3. Very nifty, as you said, and great to be able to do this with old photos of family or friends. I would be interested to know whether the AI does as good a job on faces of people of color, especially as we learn every day of problems concerning the racism that is melded into new technologies.

  4. I’m exceedingly glad for the rules regarding photographs at the AP et al. If doctoring were allowed, we would see a lot worse than what we are already led to believe. But I agree these filters and programs are wonderful for restoration.

  5. The level of detail and accuracy on this sort of upscaling also depends a lot on the quality of the image going in; upscaling 640×480 by 4x is going to be a lot more “AI best guess”-looking than upscaling a 1920×1080 photo 2x. Like most of these sorts of “AI”, they’re trained on various patterns. Give ’em more info to start with, they’ll apply that training more accurately.

    GigaPixel has been great at converting my half-decent phone photos into something printable. The photos had good enough resolution to start with that the output looked pretty much the same, minus noticeable jaggies/JPEG artifacting.

    Topaz’s noise reduction app is pretty great at its job, too.

    It’s pretty neat tech, overall.

  6. It would be interesting to compare an original photo at 4K versus that same photo cropped and downsized to 640×480, then AI-processed back up to 4K. Especially faces. Someone with a good camera get on that, will you? Thanks.

  7. What are your thoughts on Topaz GigaPixel AI vs the baked-in “Super Resolution” option in Adobe Photoshop?

    I’ve been playing with generating images in MidJourney AI, and seen some comments about using GigaPixel to upscale the images further, but I already have a Creative Cloud subscription, so I’m wondering if it’s worth getting another paid tool.

  8. It’s interesting to see the evolution of this. Certainly, the AI-upscaled photo looks smoother to the eye, but as you point out, it’s an illustration now. The program has made guesses, often quite good ones, but it’s adding information that wasn’t there before. In many cases it’s hard to tell (cue Harrison Ford saying ‘Enhance!’), but in a few places it looks off to me (or changed enough to be noticeable). The smiley faces on the balloon, for example, seem different, especially on the lower right. Is the ‘enhanced’ version actually more accurate to reality than the original? Maybe, but we have no way of knowing. Which is fine, really…as long as this isn’t being used in a court or a similar context, as you note.

    Part of me finds stuff like this a little scary…but part of me also thinks that things like ‘They Shall Not Grow Old’ are amazing technical achievements that are of immense value (as long as we recognize what’s been done there).

  9. It looks like it took the original shirt (with a tiny pattern on it) and made it into a solid shirt which averages out the colors, which I never would have guessed if I only had the second photo.

    I wish photos came with the “I was altered in these ways” info baked into the metadata, but of course that isn’t feasible. I’m glad the AP has rigid guidelines!

  10. Thanks. Interesting commentary and application. Have you used Topaz’s other apps, DeNoise and Sharpen? Curious about their value/performance, and whether the app package deal is worth the investment. Thanks!

  11. Before reading any of the text, the photo on the right, especially the eyes and cheeks, hit uncanny valley levels for me.

  12. I suspect that KC is right about the shirt, but since it’s very unlikely the Scalzis still have it, it’s hard to be certain. But I think a human probably wouldn’t have made the same “editorial” decision; even if they knew there wasn’t enough information to know what the pattern of slightly different colors on the shirt was, they wouldn’t have assumed it was pure noise.

  13. Ain’t the 21st Century grand? I’m using Topaz’s Video Enhance AI to upscale some standard-definition video I shot decades ago to Full HD, and the results are…impressively good.

    It’s not as good as if I’d originated the video in 1080p – for one thing, the 4:3 framing remains, so everything is either pillarboxed like I LOVE LUCY or The Snyder Cut, or I have to cut off the tops and bottoms of the frame like too many older people do when they want their old John Wayne movies to “fill” modern 16:9 screens! But with a little fiddling it looks good enough to play on an HD screen with the side bars….

  14. Anthea Strezze, if it helps any Topaz’s products are a one-time only charge rather than a(n expensive) monthly subscription.

    Better, there’s a 30-day free trial, so you can compare for yourself between PS’s “Super Resolution” and Gigapixel AI.

  15. As someone who worked in the design of digital hardware image processing, lemme just say that old-school digital upscaling also had to make up numbers to fill in the new pixels it was creating. We came up with hand-coded algorithms to try to detect edges and maintain those edges in the new pixels, and so on. But in the end, 2x upscaling took 1 pixel and converted it into 4 pixels, and the new pixels all had to be assigned made-up values.

    If that isn’t a “digital illustration,” then the AI version of upscaling isn’t either.

    Both algorithms are subject to errors in their designers’ assumptions. Both produce imaginary pixels to fill in the gaps.

    The difference with AI is that the algorithm is no longer something a human can really understand. When you train a network (deep learning), you might choose how deep to make the network: how many layers sit between the input image and the output image. Every cell of every intermediate layer might turn into a “thought” or assumption or idea of the network, and what that intermediate “thought” is depends entirely on the training data you feed it.

    With the same network size and the same number of intermediate layers, the exact same cell might mean different things, and it can be difficult for humans to extract the meaning of any particular cell.

    Not only can you have bad data or incomplete data, but you can also have too much data to train your AI on. With too much data, the AI starts to basically memorize the data and doesn’t deal well with inputs that differ from the training data.

    About the only thing certain about AI is that there are no Three Laws of AI.
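The old-school 2x upscaling described in the comment above can be sketched in a few lines of Python. This is a toy nearest-neighbor example, not any real product’s algorithm; it just shows how every original pixel becomes a 2×2 block of made-up output pixels:

```python
def upscale_2x(image):
    """Nearest-neighbor 2x upscale. `image` is a list of rows,
    each row a list of pixel values."""
    out = []
    for row in image:
        # Duplicate each pixel horizontally...
        wide = [px for px in row for _ in (0, 1)]
        # ...then duplicate the whole row vertically, so 1 pixel becomes 4.
        out.append(wide)
        out.append(list(wide))
    return out

tiny = [[10, 20],
        [30, 40]]
print(upscale_2x(tiny))
# [[10, 10, 20, 20], [10, 10, 20, 20], [30, 30, 40, 40], [30, 30, 40, 40]]
```

Fancier classic algorithms (bilinear, bicubic, edge-directed) compute the new values differently, but they are all still inventing pixels.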

  16. I agree that the changes made to the low-resolution picture of Athena move it out of the “photograph” category, but maybe the traditional distinctions between “photograph” and “illustration” need to be revisited. A digital “photograph” captures light and splits it into pixels, and presumably any modification to those pixels begins to move the image toward being an illustration. That said, what about a RAW image compressed to a JPEG? What is the “photograph” in that case? And what if that compression happens in the camera, so that the only saved image already reflects a series of changes? Is it a “we know it when we see it” test? That might be as good as we can get, though it’s not particularly satisfying.

    I appreciate that AP mandates that all changes be made in the pursuit of a “clear and accurate reproduction” and which will “restore the authentic nature of the photograph.” But “authenticity” feels a little slippery here, particularly when it comes to things like exposure levels, color balance, sharpening edges, and the like. At the very least, the definition of “authentic” does not strike me as self-evident.

  17. KC: “It looks like it took the original shirt (with a tiny pattern on it) and made it into a solid shirt which averages out the colors”

    I would guess the original shirt probably had some kind of moiré pattern that messed up the original camera, and the AI transformation is probably more accurately representing how the shirt looked to human eyes.

    Moiré patterns are an example of why calling the original a “photo” and the upscaled AI version an “illustration” doesn’t really work.

    Even a raw digital image can have weird digital artifacts that aren’t in the object being photographed. Moiré artifacts show up when a pattern in real life requires higher resolution than the camera sensor has, and you get visual artifacts: a pattern appears in the raw photo that does not exist in real life.

    The video equivalent is when someone records a helicopter and the frame rate (30 fps) is far too slow to capture the rotors turning at 200 to 600 rpm, so the video makes it look like the blades are hardly rotating at all.

    Digitizing audio has a similar problem if the sample rate is less than twice the highest frequency in the original signal, but with audio you can run the analog signal through a low-pass filter before you start sampling/digitizing.

    The imaging equivalent is the optical low-pass (anti-aliasing) filter some cameras place in front of the sensor, but it softens fine detail, and many modern cameras omit it, so you get artifacts.
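The helicopter and audio examples in the comment above are both aliasing, which is easy to demonstrate numerically. In this sketch (plain Python, standard library only; the frequencies are chosen purely for illustration), a 29 Hz sine sampled at 30 samples per second produces exactly the same samples as a 1 Hz alias:

```python
import math

SAMPLE_RATE = 30.0   # samples per second, like 30 fps video
TRUE_FREQ = 29.0     # Hz; far above the Nyquist limit of 15 Hz
ALIAS_FREQ = TRUE_FREQ - SAMPLE_RATE   # -1 Hz: appears as slow 1 Hz motion

for n in range(10):
    t = n / SAMPLE_RATE
    fast = math.sin(2 * math.pi * TRUE_FREQ * t)
    slow = math.sin(2 * math.pi * ALIAS_FREQ * t)
    # At every sample instant, the 29 Hz signal and its 1 Hz alias agree.
    assert abs(fast - slow) < 1e-9

print("29 Hz sampled at 30 Hz is indistinguishable from a 1 Hz alias")
```

This is why a low-pass filter has to be applied before sampling: once the samples are taken, nothing downstream can tell the fast signal from its slow alias.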

  18. I find the “illustration” to be very impressive. I would be easily fooled into thinking that it is an unaltered photograph.

  19. You make some excellent points. I’m a local historian, and the thin line between photograph, printed photograph, edited or enhanced copy, and something altogether different bedevils me and others as we collect digital-based images. It will be interesting as we figure out the best way to further elaborate the formerly usual caption explanations. Steve.
    The new photomultiverse.

  20. Not my territory, at all, but the impression I get is that the AI takes the smileys and the shirt and says “I don’t know what’s going on here, but this is what those things generally look like.”
