Spaces · Running on L4
Cropping absurdity
The mandatory cropping (applied even to very low-resolution/small inputs!) appears to be entirely automated and does not take into account any compositional specifics of the input. In practice, this makes the implementation absurdly unusable. For example, say I input a relatively small (512x768) portrait of a person sitting. Your pre-processing script automatically crops out 75% of the image, including the subject's head and face, and instead renders a tile featuring an extreme zoom-in on the subject's hand. And because the image was already low-resolution in the first place (otherwise, why would I even use this Space), the cropping leaves me with a remainder fragment at an even lower resolution: so low, in fact, that likely no upscaler (and certainly not this implementation) would be able to improve it. I am aware that you've introduced the cropping script in the interest of keeping the queues here short. And yes, I can guarantee that if the cropping feature stays in place, there won't be any long queues: but only because the frustration with the cropping will soon drive away everyone interested.
Hi, it looks like your input image contained compression artifacts, which is something that Thera currently can't handle well. We have also added a note about this to the demo.
It's important to note that the scope of our paper was arbitrary-scale super-resolution (ASR), and for that we assumed inputs without artifacts. At a later stage, we might re-train the model with such data, but so far this has not been the focus. Other current ASR methods don't handle artifacts either.
That is the image it gave me back. Oh well, this is simply not a viable solution for real-world work, contrary to what was being demonstrated in the YouTube videos.
Where did you see Thera in Topaz Video AI, do you have a source?
And what YouTube video? Are we talking about the same project here?
It was all over YouTube creators' channels, not just one. My mistake, though: I just fired up Topaz Video AI, and theirs is called Theia.
Let's be honest with ourselves. When you see an image like this in the real world, there are going to be compression artifacts as well.
Using artificial means to degrade an image and then upscale it is not a real-world example, so once you bring in artifact removal as well, then we have something. At best, this is only worth it for an AI-generated image in something like ComfyUI that you just want to upscale.
No, low resolution and compression artifacts are orthogonal concepts. Our paper addresses (A)SR, not compression artifact removal. If your image has strong compression artifacts, this is not the right method to apply to it.
I'm just contrasting reality with paper assumptions. This is not a real-world-ready project outside of artificially generated low-resolution images being upscaled. I am sure it will lead other devs to fork it to the next level for real-world usage (think of camera-phone images from 20 years ago, for instance).
Not if "real world" means that the input image has JPEG artifacts, that's right, and I'm sure there are other models that deal with this issue specifically. Most cameras and smartphones, however, support uncompressed or high-quality capture (i.e., RAW or high-quality JPEG/HEIC), which will then work fine with Thera. You are right that there is still potential for combining artifact removal and ASR in a single method in future work!
Regarding the original issue, I would like to emphasize that the demo is for demonstration purposes only, and not for production use of the model. High input resolutions exceed the computational resources of this Space and lead to degraded performance for other users. We have therefore put the cropping logic in place so that users can get an idea of how the model works; if they want, they can then clone the Space or run it locally (or run the model directly, without Gradio), without any input size restrictions. Note that for a 300 x 300 pixel input image with x5 SR, the output already has more pixels than a Full HD image. I am closing this issue since we're currently not expecting to acquire bigger hardware for this Space.
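For anyone wondering about the pixel-count claim above, here is a quick back-of-the-envelope check (a sketch only; the function name and the square-input assumption are mine, not part of the Thera code):

```python
def output_pixels(width: int, height: int, scale: int) -> int:
    """Pixel count of the super-resolved output for a width x height input
    upscaled by an integer factor `scale` in both dimensions."""
    return (width * scale) * (height * scale)

full_hd = 1920 * 1080               # 2,073,600 pixels
sr_out = output_pixels(300, 300, 5)  # 2,250,000 pixels

print(sr_out, full_hd, sr_out > full_hd)  # 2250000 2073600 True
```

So even a modest 300 x 300 crop at x5 already produces more output pixels than a Full HD frame, which is why uncropped high-resolution inputs quickly exhaust the Space's shared GPU.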