No, I attempted a standard CNN that used the full-size listing images, and it didn't detect the box accurately (at least the way I modeled the data). It had a false positive for almost every image. Imagine scanning a bundled lot: that's gonna contain a lot of noise. Unless you have a large dataset and a good way to structure the data, this isn't possible. You have, however, convinced me not to release the source code.
Sorry about that
OK, so it's not really achievable with the tools available at the moment. Maybe later it will be, who knows. I love game-changing (or game-breaking) software like this. Personally, I've developed a couple of tools over the years to give me an advantage, nothing fancy at all, and certainly not AI.
Since I'm not really familiar with the technology, I'm just wondering: is looking for the NFR legend the simplest approach? Instead of looking for the NFR legend, how about first identifying which game it is, then comparing it against an NFR label of that game, maybe just the region where the NFR legend is expected? I wonder if that approach would be possible.
The thing is, there are only roughly 25 (maybe a little more) N64 NFR carts, so there's no need to identify the game. You simply search for the title on eBay and they pop up.
The approach you suggested does not account for weird angles, torn labels, different lighting, etc., and would likely produce a ton of false positives. This network is designed to look for features in the image, such as the shape of a cartridge (from multiple angles), as well as the NFR label.
You could tilt the cartridge so the label is insanely stretched, but because it looks for features, and knows what a cartridge and an NFR label look like from different angles, it'll have no trouble labeling it. This is a general solution to the problem of different camera quirks. However, the approach you proposed has been tried before, and it does (to some extent) work sometimes if implemented correctly.
- nfr
Thanks for answering my questions. I'm just floating ideas and most likely won't try to develop anything, but I find this very interesting.
You mentioned the Nintendo logo causing a lot of false positives with a standard CNN. How difficult would it be to disregard the bottom 20% of the label? Or to measure, as a percentage, how far a match is from the edges, and determine that way whether it is an NFR legend or the Nintendo logo? This most likely won't be valid for boxes, but I'd imagine more than 90% of the NFR listings found must be cart only, and cart labels seem to have the Nintendo logo at predictable spots towards the bottom.
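To make the "bottom 20%" idea concrete, here is a minimal Python sketch. It assumes the label has already been located and is roughly upright, and that both the label and the candidate match are given as hypothetical `(x, y, width, height)` boxes in pixel coordinates with y growing downwards. The function names and the 20% cutoff are illustrative only, not anything from the network discussed in this thread.

```python
def relative_position(label_box, hit_box):
    """Return the hit's centre as fractions (0..1) of the label's width and height."""
    lx, ly, lw, lh = label_box
    hx, hy, hw, hh = hit_box
    centre_x = hx + hw / 2
    centre_y = hy + hh / 2
    return (centre_x - lx) / lw, (centre_y - ly) / lh


def is_probably_logo(label_box, hit_box, bottom_fraction=0.20):
    """Treat a match whose centre lies in the bottom strip of the label as the logo."""
    _, rel_y = relative_position(label_box, hit_box)
    return rel_y >= 1.0 - bottom_fraction


label = (100, 50, 200, 100)      # hypothetical label box: x, y, width, height
near_bottom = (150, 135, 60, 10)  # centre at 90% of the label height
near_top = (120, 60, 80, 15)      # centre at 17.5% of the label height

print(is_probably_logo(label, near_bottom))  # True
print(is_probably_logo(label, near_top))     # False
```

This only helps once the label box itself is known and straightened, which is the hard part with angled or cropped listing photos.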
Hmm. I think your idea could potentially get some positive results if you made a database of all possible NFR labels (Pokemon Snap, LoZ: OT, etc. have different NFR labels), then moved each one around in the x, y position and did a pixel similarity calculation. But even then, the lighting of the image vs. the label you're comparing to is gonna mess with your results. I think a better approach would be to look at the colors: since most NFR labels are a shade of red, see if you can make it detect red rectangles that have sharp edges. That's the only algorithmic way I can think of.
Also, you're gonna need to transform images like this into a "square" image using linear algebra (a perspective transform). If you use C# like me, you may wanna look at AForge.NET (full disclaimer: I've never used it).
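A rough Python sketch of the color idea, under loud assumptions: the image is a plain list of rows of `(r, g, b)` tuples, "red" is decided by two hypothetical RGB thresholds (`min_red`, `max_other`), and there is only one red blob in the picture. It finds the bounding box of all red pixels and measures how much of that box is actually red; a value near 1.0 suggests a solid rectangle. Real photos would likely need a hue-based color space and per-blob handling, so treat this as an illustration only.

```python
def is_red(pixel, min_red=150, max_other=100):
    """Crude RGB test for 'a shade of red'; both thresholds are guesses."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def red_bounding_box(image):
    """Return (left, top, right, bottom), inclusive, around all red pixels, or None."""
    coords = [(x, y)
              for y, row in enumerate(image)
              for x, pixel in enumerate(row)
              if is_red(pixel)]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def rectangle_fill_ratio(image):
    """Fraction of the red bounding box that is red (1.0 means a solid rectangle)."""
    box = red_bounding_box(image)
    if box is None:
        return 0.0
    left, top, right, bottom = box
    area = (right - left + 1) * (bottom - top + 1)
    red_count = sum(1
                    for y in range(top, bottom + 1)
                    for x in range(left, right + 1)
                    if is_red(image[y][x]))
    return red_count / area


GREY, RED = (128, 128, 128), (200, 30, 30)

# 8x6 grey image with a solid red rectangle at x = 2..5, y = 1..3.
solid = [[RED if 2 <= x <= 5 and 1 <= y <= 3 else GREY for x in range(8)]
         for y in range(6)]

# 3x3 grey image with red only on the diagonal (not a rectangle).
diagonal = [[RED if x == y else GREY for x in range(3)] for y in range(3)]

print(red_bounding_box(solid), rectangle_fill_ratio(solid))
print(red_bounding_box(diagonal), rectangle_fill_ratio(diagonal))
```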
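For anyone curious what that linear algebra looks like, here is a small pure-Python sketch of a four-point perspective transform (a homography). It assumes you already know the four corners of the label in the photo, which is itself a hard problem. The corner coordinates and function names are made up for the example, and this is not how AForge.NET or the network in this thread does it.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 8 coefficients mapping the 4 src corners onto the 4 dst corners."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(h, point):
    """Apply the perspective transform to a single (x, y) point."""
    x, y = point
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical label corners in a tilted photo, clockwise from top-left.
photo_corners = [(10, 20), (110, 30), (120, 140), (5, 130)]
# Target: a flat 100x100 square.
square_corners = [(0, 0), (100, 0), (100, 100), (0, 100)]

h = homography(photo_corners, square_corners)
for corner in photo_corners:
    print(corner, "->", warp_point(h, corner))
```

To straighten a whole image you would loop over every pixel of the output square and sample the photo through the inverse mapping; libraries do that step for you.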
I would (and this is completely serious) like to see your progress using a purely algorithmic approach. It could be interesting to see if it could be done.
By the way, if you need help or have questions, feel free to PM me or join the Discord I created for data gathering: https://discord.gg/Ncs33mQ...
Thanks a lot! I really appreciate all your input. I'm going to check out AForge.NET. I wish I had the time to play with this; maybe someday I'll get to it. I'll join the chat.
Any algorithm like this is going to have a confusion table. I'm pretty curious how you set your tolerances, and whether or not you accept a higher false positive rate in favor of a reduced miss rate. Granted, when running on real-world data you don't know your miss rate, but when developing something like this you need both a training set and a test set to determine how reliable it is.
Also, any successes? I can't imagine that seller mistakes with NFRs are that common.
Although I did score a lot with 2 NFR Wario Blasts in it!
They are rare indeed. But in my experience, it's more common to find Not For Resale games in regular listings by people who are oblivious to the rarity of the cartridge (most people just list it as the title, and ignore the NFR label).
I've stumbled across a few pretty rare titles that weren't listed as Not For Resale, so it definitely is possible.
As for the confusion matrix you mentioned, the network has a separate dataset of the same games in their regular versions, which it compares against (as described in the thread). I have yet to see a false positive.
I find that surprising, and I feel like your tolerances are too low. Any classification algorithm is guaranteed to have some trade-off between false positives and false negatives/true positives. I'd be concerned if I weren't getting any false positives, because it would suggest that I'm getting more false negatives.
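The trade-off is easy to see with a toy Python example. Assume the classifier outputs a confidence score per listing and flags anything at or above a threshold. The scores and labels below are made up, and the thresholds are arbitrary.

```python
def errors_at_threshold(scores, actual, threshold):
    """Return (false_positives, false_negatives) when flagging scores >= threshold."""
    pairs = list(zip(scores, actual))
    false_positives = sum(1 for s, a in pairs if s >= threshold and not a)
    false_negatives = sum(1 for s, a in pairs if s < threshold and a)
    return false_positives, false_negatives


def sweep(scores, actual, thresholds):
    """Evaluate several thresholds; each result is (threshold, fp, fn)."""
    return [(t, *errors_at_threshold(scores, actual, t)) for t in thresholds]


# Invented confidence scores and true labels (True means "NFR").
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.10]
actual = [True, True, False, True, False, True, False]

for threshold, fp, fn in sweep(scores, actual, [0.2, 0.5, 0.9]):
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data the strictest threshold gives zero false positives but misses three of the four real NFR carts, which is the concern raised above.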
Comments
Cool stuff, and nothing that you can really do about that situation.
I'd gladly sell this to a maximum of 5 people for 350$ per license. It includes an eBay monitor. Can be changed to fit your needs.
Does this include source so I can customize / extend it as I see fit (for my own use of course, not to resell)?
350$ in BTC.