Spaces · Running on L4
Cropping absurdity
The mandatory cropping (applied even to very low-resolution/small inputs!) appears to be entirely automated and does not take into account any compositional specifics of the input. In practice, this makes the implementation absurdly unusable. For example, say I input a relatively small (512x768) portrait of a person sitting. Your pre-processing script automatically crops out 75% of the image, including the subject's head and face, and instead renders a tile featuring an extreme zoom-in on the subject's hand. And because the image was already low-resolution in the first place (otherwise, why would I even use this Space), the cropping leaves me with a remainder fragment at an even lower resolution: so low, in fact, that likely no upscaler (and certainly not this implementation) would be able to improve it. I am aware that you've introduced the cropping script in the interest of keeping the queues here short. And yes, I can guarantee that if the cropping feature stays in place, there won't be any long queues: but only because the frustration with the cropping will soon drive away everyone interested.
Hi, it looks like your input image contained compression artifacts, which is something that Thera currently can't handle well. We have also added a note about this to the demo.
It's important to note that the scope of our paper was arbitrary-scale super-resolution (ASR), and for that we assumed inputs without artifacts. At a later stage, we might re-train the model with such data, but so far this has not been the focus. Other current ASR methods don't handle artifacts either.
That is the image it gave me back. Oh well, this is simply not a viable solution for real-world work, contrary to what was being demonstrated in the YouTube videos.
Where did you see Thera in Topaz Video AI, do you have a source?
And what YouTube video? Are we talking about the same project here?
It was all over YouTube creators' channels, not just one. My mistake, though: I just fired up Topaz Video AI, and theirs is called Theia.
Let's be honest with ourselves. When you see an image like this in the real world, there are going to be compression artifacts as well.
Using artificial means to degrade an image and then upscale it is not a real-world example, so once you bring in artifact removal as well, then we have something. At best, this is only worth it for an AI-generated image in something like ComfyUI that you just want to upscale.
No, low resolution and compression artifacts are orthogonal concepts. Our paper addresses (A)SR, not compression artifact removal. If your image has strong compression artifacts, this is not the right method to apply to it.
I'm just contrasting reality with paper assumptions. This is not a real-world-ready project outside of artificially generated low-resolution images being upscaled. I am sure it will lead other devs to fork it to the next level for real-world usage (think of camera-phone images from 20 years ago, for instance).
Not if "real world" means that the input image has JPEG artifacts, that's right, and I'm sure there are other models that deal with this issue specifically. Most cameras and smartphones, however, support uncompressed or high-quality capture (i.e., RAW or high-quality JPEG/HEIC), which will then work fine with Thera. You are right that there is still potential for combining artifact removal and ASR in a single method in future work!
Regarding the original issue, I would like to emphasize that the demo is for demonstration purposes only, and not for production use of the model. High input resolutions exceed the computational resources of this Space and lead to degraded performance for other users. We have therefore put the cropping logic in place so that users can get an idea of how the model works; if they want, they can then clone the Space or run it locally (or run the model directly, without Gradio), without any input size restrictions. Note that for a 300 x 300 pixel input image with x5 SR, the output already has more pixels than a Full HD image. I am closing this issue since we're currently not expecting to acquire bigger hardware for this Space.
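For anyone wondering about the pixel-count claim above, here is a quick back-of-the-envelope check (a sketch only; the function name and the square-input assumption are mine, not part of the Thera code):

```python
def output_pixels(width: int, height: int, scale: int) -> int:
    """Pixel count of the super-resolved output for a width x height input
    upscaled by an integer factor `scale` in both dimensions."""
    return (width * scale) * (height * scale)

full_hd = 1920 * 1080               # 2,073,600 pixels
sr_out = output_pixels(300, 300, 5)  # 2,250,000 pixels

print(sr_out, full_hd, sr_out > full_hd)  # 2250000 2073600 True
```

So even a modest 300 x 300 crop at x5 already produces more output pixels than a Full HD frame, which is why uncropped high-resolution inputs quickly exhaust the Space's shared GPU.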