Mike's Cube Code - Computerphile

  Рет қаралды 119,517

Computerphile

Computerphile

Күн бұрын

Coping with code to constantly count configurations of cubes can cause considerable consternation. Can Computerphile contributor Mike’s concoction continue calculating complete cube configurations or culminate in catastrophe?
Mike's Github link:
github.com/mikepound/cubes/
Repository that did this first a few years ago 😊
github.com/noelle-crawfish/En...
Wikipedia article:
en.wikipedia.org/wiki/Polycube
The page that enumerated up to n=16
Polyominoes Enumeration (kevingong.com)
kevingong.com/Polyominoes/Enum...
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Пікірлер: 646
@ralphzajac8397
@ralphzajac8397 10 ай бұрын
"The Programmers' Credo: We do these things not because they are easy, but because we thought they were going to be easy,"
@aceman0000099
@aceman0000099 10 ай бұрын
*creed
@DavidLindes
@DavidLindes 9 ай бұрын
@@aceman0000099 they're synonyms... so... user's choice.
@aceman0000099
@aceman0000099 9 ай бұрын
@@DavidLindes credo is the word for creed in other languages, not English
@QuantumJump451
@QuantumJump451 9 ай бұрын
​@@aceman0000099really? It's in the dictionary, pretty sure they're both English words
@custard131
@custard131 10 ай бұрын
cant wait for the "Someone improved my code by 40,832,277,770%" follow up video
@letsgocamping88
@letsgocamping88 10 ай бұрын
I'm looking forward to the coding train version. And the chaos that ensues!
@knssoftware6018
@knssoftware6018 10 ай бұрын
Especially if it is Mat Parker who nails it.
@downstream0114
@downstream0114 10 ай бұрын
@@knssoftware6018 It's gonna be a Parker optimization, 40,832,277,770% slower.
@JoQeZzZ
@JoQeZzZ 10 ай бұрын
He at least used numpy, so that's already about a 1000x seedup from how matt parker would've done it.
@k0pstl939
@k0pstl939 10 ай бұрын
​@@JoQeZzZimport math as maths
@aikumaDK
@aikumaDK 10 ай бұрын
"should be easy enough to code up" "the record is for n=16" "okay, it's gonna be borderline impossible"
@adamsbja
@adamsbja 10 ай бұрын
I love these sort of ideas that are easy to grasp, get complicated, but get complicated in an understandable way. As soon as he starts laying out his method you can see why it explodes.
@error.418
@error.418 10 ай бұрын
love that the record is actually n=18 from 2006 "Counting d-Dimensional Polycubes and Nonrectangular Planar Polyominoes" by Gadi Aleksandrowicz & Gill Barequet (2006)
@bmitchem457
@bmitchem457 10 ай бұрын
"Starting programming it thinking it'd be quite trivial" Famous last words
@pdrg
@pdrg 10 ай бұрын
Laughs in RegEx
@carlociarrocchi2793
@carlociarrocchi2793 10 ай бұрын
That's what happens with all of my projects.
@gilly_the_fish
@gilly_the_fish 10 ай бұрын
I call them "famous first words", because that's how many of my projects have started. "Oh this should be trivial" *disappears for 3 weeks* "Guess I was wrong"
@snack711
@snack711 9 ай бұрын
happens to me everytime, starting to slowly get more realistic about it
@tommydowning3481
@tommydowning3481 10 ай бұрын
What I like about Mike's videos is not only the content, but the genuine enthusiasm he brings to the table. It's very enjoyable and relatable. Great stuff.
@amycoscyrus5740
@amycoscyrus5740 10 ай бұрын
He really shows that he loves his job and the work with numberphile
@michaelpound9891
@michaelpound9891 10 ай бұрын
I can’t express to everyone how happy all these fantastic comments make me :D Get coding people we’ve got a record to set!
@sephirapple7317
@sephirapple7317 10 ай бұрын
@michaelpound9891 Hey, I don't know if you've already considered this, but computing 3d transformations can often be easier by using 4 coordinates instead of the traditional x y z system. I don't know if this would work in this case, maybe you've already tried using quaternions, but if you haven't it may be a helpful tool here, since they often make computing/encoding 3d rotations much easier. Hope this helps somehow, I don't know enough about the problem to say how it might be implemented here but since quaternions are often helpful for rotation I thought it might be worth mentioning in the hopes it could be useful to you, best of luck with it either way! 🙂🍀
@PedroContipelli2
@PedroContipelli2 10 ай бұрын
Apparently the record will be harder than previously thought! Current record is actually n = 18, computed in "Counting d-Dimensional Polycubes and Nonrectangular Planar Polyominoes" by Gadi Aleksandrowicz & Gill Barequet (2006). Wikipedia page has been edited! n = 19 next?
@error.418
@error.418 10 ай бұрын
we're not your personal coding army
@aloluk
@aloluk 10 ай бұрын
I would but I dont have spare time now a days.
@napukapu
@napukapu 10 ай бұрын
You are the GOAT lecturer of Computerphile. I subscribe for Mike's wisdom.
@YingwuUsagiri
@YingwuUsagiri 10 ай бұрын
Mike is just so relatable. Way too often has a colleague said something and then me going hmmmmm I can do that right? Should take fifteen minutes. And then it took six hours, many pages of research and a ginormous headache and it will absolutely be a one trick pony solution that will never ever see the light of day again outside of this one usage.
@roberttalada5196
@roberttalada5196 10 ай бұрын
That’s maths for you
@goat5249
@goat5249 10 ай бұрын
This video has been up for over an hour... So has the world record been beaten yet?
@shouldb.studying4670
@shouldb.studying4670 10 ай бұрын
+
@Alex-ck4in
@Alex-ck4in 9 ай бұрын
I'm tempted to take a stab (and fail 😂)
@whenindoubt
@whenindoubt 10 ай бұрын
This is a graph-coloring problem. The transform-invariant representation of a polycube is a cube-adjacency graph. Each polycube record also stores a mapping from (axis, direction) to a "color" (index), and each adjacency edge is "colored" by the (axis, direction) tuple. Two polycubes are identical iff their mappings can be reconciled such that they can be colored equivalently. (Corrections for chirality can be made by imposing ordering restrictions on the color mapping.) I don't think there is a characteristic graph encoding that will produce a unique hash value (this is what makes the problem interesting). But an approximation that will converge to an acceptably-small set of candidates is the number of vertices of each degree and number of edges of each color, sorted.
@juancarlosvasquez1865
@juancarlosvasquez1865 10 ай бұрын
Great comment. As the latest fields medals on combinatorics, it is always a representation bottleneck problem. Like Feynman Lagrangian-Quantum Mechanics based in least action principle against the Schrodinger Equation Hamiltonian approach.
@DavidBerglund
@DavidBerglund 10 ай бұрын
Going to go a thumbs up on this because it sounds plausible but having basically no math background, this could just as well be that I'm being trolled on the internet again... 😂 But no this sounds like such an interesting field to think about. So, combinatorics, is that it?
@jamesrivettcarnac
@jamesrivettcarnac 10 ай бұрын
What I came for
@alegian7934
@alegian7934 10 ай бұрын
so you're saying this boils down to graph isomorphism?
@CatoSierraWasTaken
@CatoSierraWasTaken 10 ай бұрын
How do you represent chirality using graph coloring?
@liliwheeler2204
@liliwheeler2204 10 ай бұрын
I never know whether to feel reassured or concerned that the struggle of "I thought this would take 15 minutes but actually it was 8+ hours" continues all the way up through every level of programming experience.....
@CTimmerman
@CTimmerman 10 ай бұрын
That's mainly due to not knowing every detail about the libraries you're using, and the lack of people doing automated regression tests with 100% coverage. A new Pillow bug broke one fuzzing instance of my steganography program for example, but i don't care enough to monitor that after linking my test in their issue tracker.
@squishmastah4682
@squishmastah4682 10 ай бұрын
When you're randomly browsing KZbin and Mike drops some code in your eyeballs...
@damianocaprari6991
@damianocaprari6991 10 ай бұрын
That's the good stuff
@GeorgeBratley
@GeorgeBratley 10 ай бұрын
Please do a follow up looking at some of the techniques people used to make this more efficient!
@bogdannikolakov2159
@bogdannikolakov2159 10 ай бұрын
It’s been four hours now, so someone has already improved the algorithm by 100x, CUDA’d the shit out of it and ran it on a 2000 GPU server farm
@AlphaPhoenixChannel
@AlphaPhoenixChannel 10 ай бұрын
Man watching this immediately before bed was a bad idea… I WAS planning to go to sleep, but now I’m laying in front of a sheet of 11x17 agonizing about chirality
@landsgevaer
@landsgevaer 10 ай бұрын
If anybody wonders, this is OEIS sequence A000162. YT doesn't like me posting the link, but that should suffice to find it.
@BenJamin-rt7ui
@BenJamin-rt7ui 10 ай бұрын
Turned on my radio today. Imagine my surprise to hear Dr Pound enlightening the listeners on the subject of AI.
@michaelwilson5742
@michaelwilson5742 10 ай бұрын
Not as big as ours, as our press office gave it the same level of coverage as when he was on bbc click. Which is zero. 😮
@BooBaddyBig
@BooBaddyBig 10 ай бұрын
One interesting thing would be to find the center of mass of the shape and make the coordinates from there. You have to come up with a hash that can be quickly updated as the centre of mass moves around. And you could use the centre of mass as part of the hash, if the centre of mass can't line up, then it's a different object already. For example if the CofM is at (1.2, 3) but another is at (1.5, 4) then you know they cannot match because no square rotations or integer translations can move 1.5 to 1.2.
@GuentherJongeling-hu6oe
@GuentherJongeling-hu6oe 10 ай бұрын
Another idea for the notation I had is from some ideas of organical chemistry, where there is a naming scheme which uses the longest chain of Carbon atoms in a molecule as a base, and has all side-chains named in a specific way too. For relatively simple structures like these cube structures are, there is only one way to name it. This would require computation of the chain lengths, but could maybe account for rotation. This is quite a bit harder in 3D probably, but maybe with some modifications it might work.
@pettere8429
@pettere8429 10 ай бұрын
Use the largest x-by-y-by-z block in place of the longest chain and then number the possible attachment points for features in a systematic way and it should work.
@Ylyrra
@Ylyrra 10 ай бұрын
The issue with normalisation like that is whether calculating the normal form takes longer than just doing the fairly inexpensive rotation and hashing. In aggregate the rotation and hashing is expensive, but individually it isn't giving you a great deal of budget to find an improvement in. I'm not saying this approach is wrong, it's where I'd start looking too, it's just that the solution needs to be REALLY simple and fast to be an improvement.
@kristiankjaergaard
@kristiankjaergaard 10 ай бұрын
Geometric mean followed by principal component analysis?
@segganew
@segganew 5 ай бұрын
Heh I thought of something like SMILES too
@johnrozewicki7132
@johnrozewicki7132 10 ай бұрын
I worked on this problem for protein comparison as part of my academic research. You can store these in a rotationally invariant way by encoding subsequent cubes in reference to an "origin" cube and not in global coord space.
@bastscho
@bastscho 10 ай бұрын
However this does not work unless i am missing something. Suppose we have 🟥⬜ where the red cube is the origin block. Obviously the white block is directly connected to the redblock with distance 1. Thus ⬜ 🟥 Would immediately be detected as an equivalent to the first. But now we build further to 🟥⬜⬜, where the third block has distance two from the origin in the east direction. However we can also build ⬜🟥⬜, with two blocks of distance one. Now both of these are equivalent, as we do not color the blocks. But your system may not detect this.
@NoomerLoL
@NoomerLoL 10 ай бұрын
Perhaps you could align the 3D object by its dominant orientation (similar to how it's done for Scale-invariant feature transformations) and then store the rotation invariant hash of your encoding, that way you only have to look up one hash per object. No idea how efficient this would be, though.
@rich1051414
@rich1051414 10 ай бұрын
A foolproof way to align any shapes rotations to a deterministic orientation that requires less computation than actually rotating all the different possible rotations would be the way. Otherwise, you could just use a standard unique hash function and simply assign the min/max hash of all of those rotations, for the duplicate prevention.
@brunobastos5533
@brunobastos5533 10 ай бұрын
better than say is doing it , he got the code in git or can start a new one , is great exercise
@oasntet
@oasntet 10 ай бұрын
Hash lookups aren't actually all that expensive, and it's a finite number of them regardless of population size. So long as the lookup remains constant time (and there were cases in Python where this stops being true, but I can't speak to current Python or numpy) then the only difference is memory usage. That's not nothing, but at the n=12 polycube it's probably not a major concern. I'm betting either the hash lookup needs a tiny tweak to return it to constant time, or some other part of the algorithm is the real issue.
@rsv9999
@rsv9999 10 ай бұрын
then you can use this knowledge of all 24 rotations to store if the object is equivalent on each side, reducing the amount of times you add a cube to both sides of the same shape, needing to go through all 24 iterations more. potentially this could even eliminate all duplicate objects, although I doubt that's true but would take too much time for me to think about.
@NicholasThompson1982
@NicholasThompson1982 10 ай бұрын
This sound similar to my idea of representing the shape’s 0’s and 1’s as a binary number, apply all rotations and pick the largest/smallest one as the primary hashed version. In fact… why even hash it if it’s just a very large integer? Could probably even avoid casting to an int if you just binary sort 24 long-binary numbers?
@hanifarroisimukhlis5989
@hanifarroisimukhlis5989 10 ай бұрын
My thinking is to "canonicalize" the shape such that for every rotation it resolves to the same shape (therefore same hash). It might be done by first rotating until (block sizes) x >= y >= z, then rotating each axis 180 degrees, minimizing center of mass coordinate. In the end the orientation will minimize center of mass coordinate, and since there ins only one way it can be done, it's canonical. EDIT: It's a bit moe efficient, since it only took 5 rotation at most rather than trying all 24 rotations.
@kindlin
@kindlin 10 ай бұрын
I literally just got done typing this and scrolled down 1 more screen. And it would be much more efficient, as with a canonical rotation, you only compare 1 shape against a million others, not 24 different rotations against a millions others.
@Winasaurus
@Winasaurus 10 ай бұрын
"Now you've got a kind of t-shape, whatever that is called in tetris, I'm sure it's got a name" It's just called a t-piece. They're all named after the shape they look like so T, S, Z, J, L and I.
@oasntet
@oasntet 10 ай бұрын
In before somebody cites that photoshopped tetris manual that was a meme four years ago.
@Dayanto
@Dayanto 10 ай бұрын
A trivial way to find a universal encoding is to simply compare the encodings for all the rotations and consistently picking the one with the lowest value. This always yields the same result regardless of the initial rotation. This requires 24x less space and lookups, but the tricky part is optimizing the encoding step to deal with all the rotations. The key here is that since we're only looking for one representation for all symmetries, we can aggressively prune other options as early as possible, before doing most of the heavy work. This has a theoretical upper bound of 24x faster than the original by disregarding irrelevant symmetries. In addition to this, you can represent the longest dimension as a bit field (uint) for an additional 5-16x speedup (when n=16) on allocations, encoding, comparisons etc. This might also be a bit more compact than a run length encoding. Note however that this requires special handling for equally sized dimensions since it only works in one direction. All in all, combining all of these tricks, you might in theory be able to speed up the algorithm by >100x. In reality though, it's probably much less on average.
@MNbenMN
@MNbenMN 10 ай бұрын
I was about to write down an algorithm to simplify the representations as a series of vectors, but realized it would only work (with implicit unique representations) for shapes that don't branch and don't form loops. I get why Mike said a bunch of people said it would be easy, but never got back with an algorithm.
@marsovac
@marsovac 10 ай бұрын
To account for rotation/flips use the hash that is "an array of distances of each cube to every other cube in the shape, sorted". This should be unique for each. If every distance of every building block to all others is the same, it should be the same shape, possibly flipped or rotated, at least in Euclidean geometry. For 20 cubes that would be 210 simple arithmetic operations + a sort on 20 numbers. Still probably much faster than doing the rotations.
@AkantorJojo
@AkantorJojo 10 ай бұрын
This would miss out on chirality. As "lefthanded" and "righthanded" shapes would give you the same numbers despite beeing different - as they can't be rotated into each other. But otherwise I really like the idea :D
@letsgocamping88
@letsgocamping88 10 ай бұрын
​@@AkantorJojosurely the left handed version of a shape is in fact a new shape?
@1paulromero
@1paulromero 10 ай бұрын
​@@letsgocamping88not in 3d
@Bunny501
@Bunny501 10 ай бұрын
​@@1paulromeroYes they are different shapes. Your left hand can't be rotated to be the same shape as your right hand.
@Klisteristhashit
@Klisteristhashit 10 ай бұрын
⁠@@letsgocamping88that is what @AkantorJojo is saying. But the proposed method, using Euclidean distance between the cubes, would not identify a right handed and left handed shape as different. Because all distances when sorted would be identical.
@Philiquaz
@Philiquaz 10 ай бұрын
Small thought for translation - this doesn't play a factor when there's only a few cubes, because those also fall under rotations. But translational symmetry will start catching you out once you're at 4x4x4 or larger. There's a lot of ground to be made in skipping over entire branches/systems that will always produce symmetric results. A comment earlier mentioned storing the moment of inertia as a quick hash for faster lookup - and another solution was aligning to a "primary" orientation to avoid rotational symmetries. It's a rough problem for sure.
@rmsgrey
@rmsgrey 10 ай бұрын
He mentioned that he's already clipping the shapes to their bounding cuboid, which eliminates translation.
@MrSonny6155
@MrSonny6155 10 ай бұрын
Of course, it will be a trade off between computing the orientation hash vs your poor sticks of RAM that will soon be seeing the lights from above.
@hurktang
@hurktang 10 ай бұрын
Just brainstorming in here... We could describe all shapes by starting at a point and moving in all the dimension, so you can go (u)p, (d)own, (r)ight, (l)eft,(f)orward or (b)ackward. You can also re(w)ind a block back to start a second branch from a previous block. With those sets of 7 moves, you can describe all possible shapes. Then we can add building rules so you always go up first, then right, then forward, etc... and do the least amount of rewind possible. So all shapes would have 1 preferred code for it. The Z shape would be URRU and the T shape would be UURWL. All shapes could be described with this 8 base system. And, I might be wrong, but that might be easy to handle with the proper code. Only having to test 1 orientation per shape, having low memory cost in the list for shapes and above all probably be able to rule out a ton of shapes before even testing it against the list.
@randEveScrub
@randEveScrub 10 ай бұрын
6 years ago I was asked to implement this in a 1 hour google interview... it has been haunting me ever since
@Imperial_Squid
@Imperial_Squid 10 ай бұрын
Editing the wiki page in two weeks "All 17 polycube combinations were discovered by Mike Pound (via outsourcing to the computerphile community)"
@yogeshsingla131
@yogeshsingla131 10 ай бұрын
Here’s my hypothesis without using brute force matching of all 24 variations. For the shapes, to match the number of cubes must be same. That is, the volume is same. If the shapes are same, their centre of inertia has to be same. Each cube is reduced to its centre and the entire shapes centre of inertia can be calculated and checked in 3D floating point space.
@EdwardChan.999
@EdwardChan.999 10 ай бұрын
The position of the center of mass as well. If you have an L shape, adding a cube to the top left or bottom left or top right would change the moment of inertia by the same amount, since it's just the sum of each mass times radius. But then there's the problem of tracking center of mass after rotation.
@speedstyle.
@speedstyle. 10 ай бұрын
There are multiple shapes with the same bounding box, volume, and centre of mass. For example a 4x3 rectangle with empty insides ⠯⠽ and one with only the middle filled ⠽⠯
@pmmeurcatpics
@pmmeurcatpics 10 ай бұрын
@@speedstyle. Yes, but after using this rough filter what you'll be left with is a (hopefully, much) smaller number of cubes to search through
@speedstyle.
@speedstyle. 10 ай бұрын
But it doesn't help with the rotation? and hash tables don't care how many options you're searching through.
@VivekYadav-ds8oz
@VivekYadav-ds8oz 10 ай бұрын
i was thinking something similar but much dumber. You just take the average of all the coordinates of all the cubes. Now, with this "centre" of the shape as the origin, write out the coordinate of each of those squares in a list, and sort it according to some order. Now hash this value, and it should remain the same for all the rotations and translations.
@richardikenna1124
@richardikenna1124 10 ай бұрын
Even when I don't get it sometimes, mikes enthusiasm makes it fun. Thank you
@KuroOnehalf
@KuroOnehalf 10 ай бұрын
I'd love more videos like this. Working through problems like this is fun.
@aungthuhein007
@aungthuhein007 10 ай бұрын
Absolutely love seeing Mike talking every time!
@poorman-trending
@poorman-trending 10 ай бұрын
Instead of hashing you can define a standard rotation and rotate each new object into that orientation before looking up. For example, as each array could be considered as a binary number, rotate each new shape 24 ways and choose the orientation with the smallest binary representation. Then lookup against the rest of the previously stored shapes. Not sure if this would be faster or not but its an alternative that may be.
@rsv9999
@rsv9999 10 ай бұрын
had the same idea matey, i was thinking of storing the addition information for mirroring on an axis, meaning adding to that mirrored axis would cause the opposite side of the object to result in the same new object, reducing the need to search since you've already noticed that it's the same shape, a little caching really
@gone357
@gone357 10 ай бұрын
@@rsv9999 what if you looked from the 6 faces and compared the proyections?
@xiexingwu
@xiexingwu 10 ай бұрын
@@gone357in 3 dimensions, there may be holes in large structures that don't show via the projections.
@vladimirfokow6420
@vladimirfokow6420 10 ай бұрын
Had exactly the same idea👍 This function takes in a shape of cubes, and outputs a number that uniquely identifies this shape. Any other rotation of this shape will produce the same id, and a different shape will definitely produce a different id. - This eliminates any collisions that the usual hashing can have. - And this is the way to produce a rotationally invariant id for the shapes. - (and if you also store the dimensions beside this id, you could reconstruct the object from the id - just reshaping this array from flat to having these dimensions)
@matsv201
@matsv201 10 ай бұрын
You can't really do that because before you checked all of them, you don't know what is the smallest one. A way around this is using COG as a way to orient the the part. They you would instantly know if it's the primary orientation.
@coderentity2079
@coderentity2079 10 ай бұрын
Divide the cube that contains the shape into 8 cubes (half size on all axes), count the voxels in them, store the rotation that produces the smallest number and store the chosen rotation (0-23). Do the same for all 8 sub-cubes recursively. To find out if you already have a certain cube becomes easy: do the same dividing/counting/rotating with it and you can compare/search with a B tree or hash - when match found do this for the sub-cubes too, but remember to match the rotation too in them. You basically look for cubes with the same density distribution in finer and finer resolution.
@ChiefArug
@ChiefArug 10 ай бұрын
i love problem solving videos like these! please do a follow up if significant improvements are found
@AF-lt2fr
@AF-lt2fr 10 ай бұрын
If you assign the starting origin as 0,0 and apply the rules of no diagonals and must be connected - there's no need to do flips or rotations. It's only when adding the third dimension - e.g. 1,2 is not the same as 1,1,1. So for 4 you could have, 1,1,1,1 1,1,2 1,2,1 1,2,1 (2 not touching) etc. Since there's a maximum of 6 faces which could be joined to it there's a finite amount of faces you can attach to. E.g. for 1 cube there's 0 faces For 2 cubes - 1 face For 3 cubes 1 face + 2 faces 4 cubes As 3 + 4 Etc.
@slimeboos4065
@slimeboos4065 10 ай бұрын
I have a suspicion that this could easily become an advent of code problem for 2023
@tails4e
@tails4e 10 ай бұрын
This fear crossed my mind too 😅
@Revenga849
@Revenga849 10 ай бұрын
day 18 of 2022 already has similar vibes with polycube surface calculation
@Zach010ROBLOX
@Zach010ROBLOX 10 ай бұрын
Now I want to try this. First naive approach will be to think about the max space taken up by all orthogonal combinations (where a 3x3 would be a d8 shape of voxels/cubes) then use some theory to reduce the number of possible shapes that fit in that space (like dividing by number of symmetries). Lastly, the trick would be finding the number of diagonally attached shapes still possible in the space. Or some variation of these steps and other methods of removal.
@BeCurieUs
@BeCurieUs 10 ай бұрын
My brain instantly went to nodes vs actually representing these things in space. I guess the issue with nodes, though, is they lack the geometric identidy we need to identify similar looking shapes. The basic idea I had was you could limit the nods to 6 connections representing the 6 spatial directions in a cube. How to get all the info we want out of that configuration requires a bit more thought :D
@leo_carlini
@leo_carlini 10 ай бұрын
Amazing ! I'm looking forward to a follow up video !
@seanm7445
@seanm7445 10 ай бұрын
My first thought would be to save each shape at a specific rotation. Some common rule such as ‘close bounding box, put largest dimension at x, smallest dimension at y...’ So technically each shape will end up the same. Then for each check, you first put it into that specific rotation and then lookup the hashtmap.
@holthuizenoemoet591
@holthuizenoemoet591 10 ай бұрын
how would you find the common rotation, once a new object is made, what rotation does it have?
@matsv201
@matsv201 10 ай бұрын
Yea.. X>=Y>=Y, that eliminates 8 possibilities out of the 24. Ad COG values equal or larger than 0 for X,Y,Z then every object is in the same orientations as well as rotation. 24 is down to only one possibility non of the others even need to be checked, just discarded. That as a extra bonus, now they can be indexed by numbers, making them sortable. That will speed up the look up table but several orders of magnitude.
@kindlin
@kindlin 10 ай бұрын
I had the same thought, but then also realized the obvious problem of square bounding boxes, that leaves a whole dimension (or two) undefined. I think the solution might be simple, tho, compute the center of mass (volume/cubes/etc) and if it's off center, use that direction as one of the axis, and if it's not off-center, reduce the bounding box you check consistently until it becomes off center, and if every cube of cubes is the same, then it's the same shape anyways.
@skleon
@skleon 10 ай бұрын
I believe Zobrist hashing would be a very promising approach. It's what chess engines use to hash chess positions to build a transposition table. Given a previous shape with its hash, it's very fast to calculate the hash of a new shape that only has one more square in it.
@aditya95sriram
@aditya95sriram 10 ай бұрын
I looked up the Wikipedia article, it's an interesting technique. However, the hash does nothing to account for the different symmetries (rotation, translation, mirror etc) which makes sense because in chess you wouldn't really worry about those symmetries.
@BooBaddyBig
@BooBaddyBig 10 ай бұрын
I think you'd be better off using a non random hash based on modulo arithmetic. That way multiplying by a number can be the same as translation. So each square has a number, and multiplying by 3 different prime numbers is equivalent to moving in the x, y, and z direction (modulo a larger prime number) respectively. Provided the prime numbers are bigger than the maximum shape and coprime, then the hash should be completely unique. You just add on the number for that square.
@tonyd6853
@tonyd6853 6 ай бұрын
The more accessible you can make these for visually impaired folks. The more people can listen to these on their daily commute while their visual faculties are otherwise consumed.
@CD4017BE
@CD4017BE 10 ай бұрын
You can get a hash that is orientation independent by using an arbitrary hash-function and apply it for all 24 orientations of the shape and then compute the sum (of any other commutative and associative operation) of these hashes. Another optimization would be, instead of 3D arrays of bytes, to use 2D arrays of 16-bit integers (no axis will ever need more than 16 unless you break the record), storing the last axis in the individual bits. This both saves memory and makes comparison faster, but it makes rotation more complicated.
@d34dc0de
@d34dc0de 10 ай бұрын
I was looking for a comment if someone had thought of this 😄My though process was more of computing the hash of the concatenation of all rotation as a binary sequence Edit: hexadecimal usernames ftw !
@garethdean6382
@garethdean6382 10 ай бұрын
Having done organic chemistry my mind automatically went to carbon compounds and some code I once idly made to calculate possible alkanes. If each cube is represented as 'c' and a face as 'f' you can write a formula for a possible cube. The first one-cube monomino is CF6, the two-cube domino is C2F10 (F5C-CF5) and so on. Deriving a formula is simple enough. For repetitions you only need check identical formulae. (A block of four cubes is not the same as a line of four cubes though an L and a line of four cubes would be.) There's more to it but boy am I too lazy now to resurrect a decade old project. Kudos to you, math-man!
@Spongman
@Spongman 10 ай бұрын
Sean's question at 8:34 is right on the money. It's significantly cheaper to calculate the rotations at the time you add a new unknown shape than it is every time you query for existence. the added storage (24x) and lookup time would be insignificant.
@foo0815
@foo0815 10 ай бұрын
@@chrismanning5232 Lookup time is insignificant, 24 times Storage quite feasible, it's much less than the branch factor for one more cube
@Spongman
@Spongman 10 ай бұрын
@@chrismanning5232 I wouldn't characterize "multiply by 24" as an explosion. sure, it's more. but it's a constant factor.
@joewakeling6867
@joewakeling6867 10 ай бұрын
I was once looking for a rotation and reflection agnostic labelling convention for 2D grids of on/off cells. My system was to place the pattern in the smallest n×n square that would contain it. Then label the pattern by starting from one of the corners and going clockwise or anticlockwise recording each on/off cell with a 1/0, repeating the process in the same direction from the cell diagonally inwards from the cell you started from. The trick to uniqueness is how you select the starting corner and direction. In essence, this was chosen so that interpreting the label as a binary string gave the biggest number possible. So first, if you can translate the pattern at all without it leaving the square, try to put an 'on' cell in a corner so you can start from there. Failing that, you want the longest string of successive 'on' cells leading away from a corner, or if there are multiple equal length strings of successive 'on' cells, pick whichever is followed by the shortest string of 'off' cells before the next 'on', and continue in this way until there's a unique choice of translation, start corner and direction that maximises the value of the binary string, or multiple traversals have traversed the whole square and therefore have the same labelling(i.e. the pattern has some symmetry) Not sure if this is any more efficient than what others in these comments have suggested about just doing a hash for all rotations and picking the min/max. Suspect you could change my technique to use rectangles rather than squares and remove the translation question. Doubt you could use this very easily in 3D since that gives a lot more traversal options
@forTodaysAdventure
@forTodaysAdventure 10 ай бұрын
1. start at 1 block and add a block to it at each face, then add a block to each one of those faces like you were saying, to build up the tree of possibilities 2. to hash the possibilities, start at the lowest block (the one closest to 0,0,0) and ask what combination of blocks does it have around it. for example, 1 neighbor is a 1, 2 neighbors on opposite sides is a 2, 2 neighbors touching is a 3. 2.5 attempt to map this to a hamiltonian path (one where you can travel to every block without backtracking) 2.6go to the next neighbor block (next neighbor is a touching block that is next closest to the origin, prioritizing closest to 0=x) 2.75 if you have to backtrack it, denote that in the hash by going into the decimals 3. convert that block to a hash and compare it to your hash table continue until you have n blocks
@happygimp0
@happygimp0 9 ай бұрын
For the rotating problem: When you generate a new structure, rotate it in every way and store the resulting rotations as a integer (or list or string, doesn't matter that much). And then only use the one with the lowest (or highest if you prefer that) one. And then only use this one rotation to see if it is in the list of all generated structures and if not only store that rotation. Of course, even faster would be a way to only generate each structure once without keeping and checking a list of all generated ones. That would also help a lot for paralleling it on multiple threads or (a) GPU(s).
@davidmurphy563
@davidmurphy563 7 ай бұрын
There's an obvious way to reduce the lookup. You can use a vectorised encoding. Basically, the idea is just to ensure discarding the absolute position info: 1. Subdivide in half, pick the side with less boxes but with at least one. Repeat. Tree check to the least box box. Don't use recursion, just loop. 2. Describe the object with a vector path string. Up then right the forward. Make sure you cull the search, up then right means you can skip right then up. 3. Search this against your set. You see? We're basically storing a graph/vector so orientation info is discarded. If we use the same algorithm to select the starting cube and then subsequent cubes then rotated and flipped cubes will give the same output. Worst case there could be objects with have symmetry resulting in a reversed path but I'd just run the list and then cull it after. That should be easy to pick up.
@Imevul
@Imevul 10 ай бұрын
My brain directly went "Well that's going to be annoying to represent in memory", and the remaining 15 minutes just confirmed that. I was thinking maybe you could represent the shape as a branching path through 3D space, where you would just say "from the previous cube, the next cube is 'forward' and the cube after that you turn 'up' and that will be the new forward", but I still don't think that will solve the rotation issue since it would depend on where in the shape you start pathing, and how you define "up" or "left" etc. from that point.
@royvanrijn
@royvanrijn 9 ай бұрын
Wrote some code to calculate the shapes and my runtimes are (in Java). I'm using a canonical way to store the "smallest" of each of the 24 rotations. This "smallest" is just arbitrarily, as long as it's consistent. I've found that just using vectors (0,0,1),(0,1,2) etc was easier to work with and faster than using bitmaps. Especially when combined with a kind of "snake" 3D index number from a space-filling curve. In my numbering system all numbers are positive (the shape is placed in the (0,0,0) origin), so that is index 0, than (0,0,1)=1, (0,1,0)=2, (1,0,0)=3, (0,0,2)=4 etc. This way storing the entire shape is a series of very small numbers, unique and compressable. This is the current runtime/speed (the counted shapes seem to match the expected values): Written file shapes-2: 1 Last run took: 12 ms Written file shapes-3: 2 Last run took: 3 ms Written file shapes-4: 8 Last run took: 2 ms Written file shapes-5: 29 Last run took: 3 ms Written file shapes-6: 166 Last run took: 7 ms Written file shapes-7: 1023 Last run took: 25 ms Written file shapes-8: 6922 Last run took: 108 ms Written file shapes-9: 48311 Last run took: 544 ms Written file shapes-10: 346543 Last run took: 3828 ms Written file shapes-11: 2522522 Last run took: 30524 ms Written file shapes-12: 18598427 Last run took: 255812 ms Written file shapes-13: 138462649 Last run took: 2406329 ms Written file shapes-14: 1039496297 Last run took: 20646680 ms (stopped it here for now, it is saving to disk, could continue indefinitely)
@0xdeadbeef444
@0xdeadbeef444 10 ай бұрын
There's got to be a way to reduce this to a graph problem. Maybe storing the poly cubes as graphs/trees?
@jannicklumme9298
@jannicklumme9298 10 ай бұрын
Thought about that too and then look up if the one in memory and the new one are isomorph. It might be that it is too slow due to graph creation overhead and isnt something like checking for isomorphism in O(n^2) or O(n^3)? Dont know the time complexity of his current strategy or other viable strategies
@aditya95sriram
@aditya95sriram 10 ай бұрын
I think converting to a graph might lose some information. For example, in 2-d, with 3 cubes, an L-shape and straight line might get converted into identical graphs right (i.e., straight path with 3 vertices)?
@nolifeorname5731
@nolifeorname5731 10 ай бұрын
@@jannicklumme9298 I had the same idea and checked. Supposedly the current state of the art for general graph isomorphism is 2^O((log n)^3) EDIT: This problem actually has a bounded valence, so apparently it's solvable in polynomial time
@drdca8263
@drdca8263 10 ай бұрын
@@aditya95sriram what about a decorated graph?
@mtardibu
@mtardibu 10 ай бұрын
That's my plan. I'm also planning on only developing the tree in the order of a BFS.
@AuratticStride
@AuratticStride 10 ай бұрын
One simple way to manage the hashing is when you add the object to the list the first time, hash the object in all possible orientations. Then, when you're looking for the object you only need to hash once, as if it exists it should be in one of the 24 previously found orientations
@wbfaulk
@wbfaulk 10 ай бұрын
As suggested by Sean in the video at 8:34, to which Mike commented that you're trading compute time for memory consumption.
@erikstv7802
@erikstv7802 10 ай бұрын
Or even just keep track of e.g. the minimum of those 24 hashes: Then your hash table won't be any larger. You could perhaps do a lookup already after having done a few of the rotations to stop that loop early (depending on how slow the rotations are versus the lookups). But since there's logics and non-numpy loops here, I expect this exercise to perform better with numba than with numpy (while working within the Python ecosystem).
@theOtherNism
@theOtherNism 10 ай бұрын
I had a similar idea, but it would be more efficient to pick, from the 24 byte sequences that represent the rotations, the one that has the smallest value when represented as a number. Then store the hash of that. (although I'm not sure if storing the hash instead of the original byte sequence has any advantage.)
@TheFinagle
@TheFinagle 10 ай бұрын
For checking rotations one thing you could do is Come up with a way to 'score' the shapes, the 'lowest scoring' rotation or reflection is the main form. For each new shape you only need to check the 24 reflections and rotations to see which scores lowest, then check if that specific form is in your list. One simple way to score is to shift it against the grid axis and string out the cells as the digits in a binary number. Every possibility of cell arrangement would have a unique number.
@PeterBeresford-hf9iz
@PeterBeresford-hf9iz 9 ай бұрын
Rather than using matrices I represented a shape as a Python list. The enclosing box of a shape is a list of layers, each layer being a list of rows, and each row is an integer where each bit is a column, set to 1 if filled by a cube or 0 if empty. E.g. the 2 shapes for N=3 are [[7]] and [[1, 3]]. Adding a cube to an N-shape simply changes the row value, if internal to the box, or adds a new layer, row or column if external. Thus adding a cube to the side of [[1, 3]] gives [[1, 3], [0, 2]]. Given a candidate shape, each of the 24 views is a new list, obtained by rotation, flip or tilt. Valid shapes have all 24 lists added to a set, and if a new candidate is in this set it is discarded immediately. The count of shapes for N=10 took 10 minutes to calculate on my laptop. Extrapolation indicates that the time for N=12 would be about 24 hours. Mike's comments on Python's built-in hashing are borne out by this speed. All is achieved by manipulating lists (which are mutable) and a list's string representation is used for the immutable entry in the set. Works a treat.
@Pystro
@Pystro 10 ай бұрын
I think the fastest strategy here would be to generate all the valid shapes, but completely *eliminate the lookup* if you've seen it before. Instead you calculate the number of different ways there are to build up that shape. This turns the problem from "Find every possible shape, and count the first unique instance fully but every consecutive instance don't count." into "find every instance, and count them all in a way so that all variations of one unique shape sum up to 1". Furthermore, you'd be able to factor in the symmetry of the shape. (For example, the "L" shape in 2D that he uses as an example can appear in 8 *different* orientations, so you divide by another factor of 8, while a 2x2 square in 2D can only be in 1 unique orientation so you divide by an additional factor of 1.) The easiest way I can see to count every instance is to add cubes one at a time, because then the question "how many ways are there to build this shape" becomes "in how many orders can you add these cubes without having disconnected clusters along the way?" Then again, i just realized that if you do the "populate a grid with ones and zeroes and check if connected" strategy, you'd be answering the question of "how many shifts are there to fit this shape into the grid" which is even easier since you merely have to count the amount of zero-padding along the 3 axes and multiply those numbers. It would also be great for multi-threading because it's what's called "embarrassingly parallelizable". The problem with the byte array representation is that the number of representations for 17 cubes would be a 17*17*17 byte array, i.e. 2^(17^3)=10^1479 candidate shapes, of which only a tiny fraction are actually valid. It might be smart to treat the trivial 1*1*N and the still rather simple 1*2*(N-1) cases separately, so that your representations can shrink to (N-2)^3. Or impose an ordering on the dimensions to make it N*(N/2)*(N/3); Or probably even both for (N-2)*(N/2)*(N/3)=10^217. Or don't place your 1's into a cube shape, but an N^3 octahedral space (10^291 candidate shapes). The best I can think of is to use an octahedron truncated to (N-2)*(N/2)*(N/3) in order to combine the space that is actually reachable plus the axis ordering.
@dano6187
@dano6187 10 ай бұрын
Several comments suggest using the hash to define the canonical rotation for each shape. However, that requires producing all the rotations and rotating the shape is perhaps the most expensive operation here. Instead, we could define the canonical rotation in a way that makes it easy to decide if adding a cube would produce a shape that is not in the canonical form and if there exists a different path to reach that shape without requiring a rotation. For most shapes there will be leaf cubes that have only a single connection. Each of these leaves define alternate paths to reach the shape. If any of these alternate paths was preferred over the current path we could ignore the current path with the assurance that the preferred path will eventually be discovered. Leafs cubes connect to branches and branches eventually connect to core cubes that form loops. The set of leaf cubes plus core cubes not directly connecting branches or leaves represents every possible path to reach the current shape from a shape with 1 less cube. If we can identify a single best possible path to reach this shape then it would not be necessary to save this shape for future tests for duplicates since only the addition of the cube from the best possible path would be counted and further expanded. Only the shapes that have multiple equivalent best possible paths would need to be tested for duplication. If is is possible to always define a unique best possible path we could eliminate all duplicate testing and therefore the memory limitation on the maximum size. This would also facilitate parallelization. However, that is a big “if”. Rather than iterative deepening where all shapes of a size are saved before searching for shapes of the next larger size, it may be better to generate the smaller shapes on the fly. A benefit in addition to saving storage for the generated shapes is that complex data defining the shapes could be built iteratively as each cube is added.
@1337GameDev
@1337GameDev 10 ай бұрын
13:51 - Treat them like 1d, 2d and 3d shapes. They are VOXELS. So just use binary to encode the shapes, 1 or 0 for each voxel in the 3x3 cube. Then.... just perform a standardization on the data to use a gravity centric method to force 0,0,0 to be the gravity center. Then just "pull" on the center of mass of the voxel towards this, and choose minor edge cases for "equal" weighted edges/balanced voxel shapes (EG: [010][010][010] looking across the shape as a series of rows from top to bottom, front to back). Then you can converge multiple rotations into a standard rotation, then just hash all of them. WORKS well for parallelization too -- first process all normalized rotations, then process sorting based on the converged shapes and have a base ID given to each distinct shape and original input and record that when sorting. Then..... convert that code into CG and run on a GPU -- as this is well suited for matrix representation and operations. Essentially you have 2000+ gpu cores being able to execute this code on a 3080....
@SStefanovski
@SStefanovski 10 ай бұрын
An encoding that accounts for rotation might be to just encode the shape from the 8 corners of the bounding box and add them all together, making the rotation problem an addition problem which is faster for the CPU to compute. If you label each cell of the 3D coordinate space with a prime number, starting from one corner and ending in the opposite corner then you could represent a shape as the product of the primes. If you do the same but start counting the primes from the 8 different corners and add all the products together you get the key for that shape to lookup in the hash table. Not sure if it would work in practice though.
@flake8382
@flake8382 10 ай бұрын
Love this problem, and I further love that it seems solvable for 3 dimensions. - solvable, as in regardless of how many cubes you use, it can be solved in some sort of consistent and logical timing pattern, like N, or N^2 or something like that. Solved as in a completed proof exists to confirm the algorithm never misses anything or miscategorises, and is as efficient and low weight as possible.
@mctuble
@mctuble 10 ай бұрын
What I like is his way of making it interactive. Follow along if you want. Share your results. Someone's bound to have a brilliant idea.
@MesaCoast
@MesaCoast 10 ай бұрын
Easy way to reduce comparisons: all you need is a way to choose 1 "preferred" orientation of a given shape out of its possible rotations, and only check that rotation against your list. One easy way to do this is to read each possible rotation of the array as an integer (since it's an array of bits, it's already an integer, you're just reading it differently), and then choose the greatest integer. Then, check if you have that shape in your array already, if not, then add it. You can also order your shape storage by this integer to allow for binary searches, reducing the comparisons even further.
@KipIngram
@KipIngram Ай бұрын
Ok, here is what I will try if I have a go at this. I want to shove my block of cubes down as close to the origin as I can, still having all the cubes be in the first octant. There are six faces that can be against the y/z plane - I choose based on bringing as much mass as possible close to that plane. Then there are four ways to orient around the x axis - I'll choose that to bring as much mass as possible closest to the x/y plane. Then hash it. The main rub I see here is breaking ties (when multiple orientations will put the same maximum mass close to whichever plane I'm considering). I'll have to give that some further thought.
@jamesalewis
@jamesalewis 10 ай бұрын
My initial inclination on the optimization front is to turn the 3D array into a big int in all its possible rotations and store the lowest value, then look up that big int in the list of found shapes. Now, this would turn into a 5k-bit datum for a 17x17x17 array of bits (4913), but in reality it would be a triangular array for a 2D (n*(n+1)/2 cells), and a right tetrahedral array for the 3D case (n*(n+1)/2*(n+2)/6 possible cells), which cuts the number of cells into half and a third respectively, so that huge int for a 17-cube array can be encoded into less than 1kb (969 bits, precisely), which could work in a lookup. You may like to take the SHA-2 of the 1kb and turn it into a 256-bit datum, just counting on no collisions. That SHA-2 hash can be used as indexes to lookup the actual shape for double-checking. If you want to get it up to 32, for a nice power-of-two, your size would be just under 6k (5,984), about 5.5x fewer cells.
@Dudleymiddleton
@Dudleymiddleton 10 ай бұрын
Back in 1980 (I think - it was a while ago!) My mum was a primary school teacher and she brought home a bbc micro with cub screen from the school over the summer hols to learn how to use it - I found this app called "Build" ( I think!) and it created this block lattice based structures navigated by cursor keys, I got hooked on it, I was about 13 -14.
@pierreabbat6157
@pierreabbat6157 10 ай бұрын
A couple of suggestions: *Write it in Julia or Haskell. Both are compiled to machine code, hence fast to run, and also fast to develop in. *Categorize the polycubes by how many of each color of cube they cover in something like a checkerboard.
@coderider3022
@coderider3022 10 ай бұрын
I made a simple Rubix cube solver in python and hit these issues, needed a few colour coded xls files to understand it first. Glad the dr covered all my approaches. My goal was to learn python and see TTD options. (I’m c# / azure dev )
@GuentherJongeling-hu6oe
@GuentherJongeling-hu6oe 10 ай бұрын
A quite obvious idea to make it faster (maybe) would be to store all rotations of the cube thingies as individual items of the hash table, that way you don't have to compare all rotations of new cube combinations to check whether the combination exists somewhere. This is of course at the expense of memory.
@letsgocamping88
@letsgocamping88 10 ай бұрын
But then how do you order the rotations within the table to get the same hash?
@MNbenMN
@MNbenMN 10 ай бұрын
The memory expense could not be much more understated. I think the memory for hashtable for the 16-part polycube shapes would end up larger than 60TB with that approach. That's a lot of RAM!
@BooBaddyBig
@BooBaddyBig 10 ай бұрын
To check for uniqueness, you only have to check the lowest (or highest) rotated hash number for any given shape, so storing them all in the hash table is pointless. Because they're all equivalent.
@itellyouforfree7238
@itellyouforfree7238 10 ай бұрын
This is EXACTLY what the interviewer suggested in his question. Did you not watch the video?
@matsv201
@matsv201 10 ай бұрын
You don't even need to do it. Just make sure that the COG is in one direction, and you know you have the only possible rotation. Not even a need to do the search.
@bertiesmith3021
@bertiesmith3021 10 ай бұрын
If you store your cube as a bit array you can treat then order the various rotations and translations. Pick the smallest number and then that is the single unique description of the shape. This can be trivially hashed. And also the bounding box for n doesnt need to be be n by n by n - it only needs to be n x n/2+1 x n/4+1, since a valid solution which doesn’t fit will have a rotation or translation which does.
@cheako91155
@cheako91155 10 ай бұрын
1. An array of bool is a big integer. 2. Whatever value is smallest is the only representation of this shape that you operate on. 3. When during a rotation the value increases or would increase, the rest of that rotation can be skipped. * Because you would never operate on a larger version of the shape. 4. After a rotation that shape's value must be smaller, then the previous version can be forgotten. 5. You don't need to use smallest value if some other scoring function would work better.
@cheako91155
@cheako91155 10 ай бұрын
I think if you did this you can even prevent yourself from creating "large" rotations in the first place, can't you slice the work area along a diagonal plane? That is a true/1 in half of the locations(in a cube of cubes) would seemingly to me have a smaller rotation.
@abhaygugale3889
@abhaygugale3889 10 ай бұрын
Check the description of this video,it's fun
@desertfish74
@desertfish74 10 ай бұрын
How much wood would a woodchuck chuck?
@Carambal81
@Carambal81 10 ай бұрын
I C 😄
@Vixikats
@Vixikats 10 ай бұрын
After being presented with the problem I immediately thought that whether the binary representation of the data could tell you about its orientation and I'm pretty sure it can, at least for 2 dimensions. For example, given the pattern for a 2-dimensional square: 1 0 1 1 1 1 1 0 0 We can produce 4 different rotations, 2 90 degree rotations and a 180 degree rotation. If we take the row by row binary representation of all of the flips, we get this: Normal: 101 111 100 90⁰ Right: 111 010 011 180⁰: 001 111 101 90⁰ Left: 001 111 101 If you take a look, you can notice something immediately apparent. Both the 180 degree and 90 degree rotations are exact binary mirrors of their normal and opposing 90 degree counterpart. They are exactly the same except the order of the bits is reversed, which makes sense. Which means we really only have 2 different binary representations for the pattern... or do we? No- we actually do. At first I was looking through some samples and thought that there was some clever bit rotations and at first it seemed like I was correct, for example, look at the example above: 101 111 100 111 010 011 We notice the pattern I was initially looking for if we just take a look at how the bits are clustered together. This actually turns out to be simply a rotation of bits of the main configuration. The 90 degree rotation operation is nothing more than a left or right bit rotation of some degree. In this case, you can get this through EITHER (Normal ROTL 5) or (Normal ROTR 4). ***HOWEVER*** It turned out that this ability to simply rotate bits was just a fluke and that it doesn't actually hold true for all configurations. So it appears as though we really do have 2 correlated, yet still independent bit strings that we can mash together in some fashion and throw into a hash function that a hash table can use. One important thing that you have to do is that you cannot have any padding bits, so no empty rows or columns before or after the configuration. This turns out to be very simple, honestly because it's very easy to get the absolute bounds of a configuration of cubes and we can simply use that to offset and normalize everything and ensure that we're only reading relevant bits. I HAVE YET to see how this works in 3 dimensions and what data I might find, but I will continue to research and test and see whether this remains true. But maybe this'll be an interesting jumping off point that someone else might be able to use while I'm working on my own things. I am assuming that given the 24 total rotations compared to the square's 4, we're going to end up with either 6 or 12 total bit configurations that can describe all possible rotational transformations of a cube.
@JATmatic
@JATmatic 9 ай бұрын
The situation is that the computation eats all the memory we give to it. If we give less than this it takes forever to compute. I'm on the C++ version trying to improve the code, patches are welcome. ;) N=16 takes estimated 1728 GB of memory as of now and output on disk is estimated 2604.8 GB. For N=17 the output on disk is estimated to 20838 GB or over 20 Terabytes.... This actually knowing what the individual polycubes are and not just the enumeration of them that has been done for N=18
@RodrigoCespedesDaza
@RodrigoCespedesDaza 10 ай бұрын
I think the key here might be creating a path finding algorithm where you touch all the cubes in an arbitrary shape in such a way that the entry point to the object is deterministic. So if you have a set of rules (like an automata) that describes how to traverse every cube in the object, you could traverse it from every starting point and have a set of unit vectors describing each step in the traversal; next step would be a sorting algorithm for each set of vectors based on rules that are deterministic (like, (1) you always go to the right if you can, (2) you only turn around if there's no unexplored cube available from the point you're standing, etc) If you can sort the sets of vectors deterministically, then you can determine an entry point to the object deterministically, regardless of the object's orientation. If you have an entry point, then all you need to do is hash the set of vectors that traverses the object from that starting point. Given this approach, the key is that your traversal algorithm must generate all possible sets of vectors for traversing the object in fewer iterations than the number of ways to rotate the object 😓. It feels like generating a set of rules similar to traversing a tree data structure will yield the set of vectors quickest this approach might have a best case scenario Ologn time complexity to deterministically generate a hash So looking up a deterministically generated hash might get you to a faster algorithm, I think this approach would reach a local maxima once N === "the number of possible rotations of an object", as long as N is lower than that number, this has possibility of being more performant
@happygimp0
@happygimp0 9 ай бұрын
If i understand you correctly. Create a small list for every rotation of a object (moved so that at least one cube has the x coordinate of zero, at least one cube has the y coordinate on 0 and one cube has the z coordinate to 0. Maybe this is a single cube that has more than a single coordinate at 0). And then use the maximum or minimum value of this list and only use this rotation for comparison and storing? It would get even better when you could only build every structure once and don't need to check if you have it already created before.
@sdgvlkjnasdlfkjawlk
@sdgvlkjnasdlfkjawlk 10 ай бұрын
Because the number of polycubes explodes to ~60 Billion at only n=16 improving the lookup when checking for duplicates seems more important than any other operation. - My primary idea is rather than creating a new polycube then checking whether it has been seen before we "look backwards." The new algorithm would be to create a new polycube of size n, then generate all polycubes of size n-1 that can be created by removing one block, and remove those from the queue. I don't know how well this could be multi-threaded though. - There are about half as many polycubes if you combine reflections. Making use of this fact would shrink the lookup time and memory requirements at the cost of more work to generate the storage representation of each cube. - Many others have already said it: The best rotation (and reflection) canonical representation I can think of is to test all of them and choose the bit string with the lowest value. This would allow you to create the full hash as n, the bounding box dimensions, and then the bit string (with leading zeroes deleted to save space) plus a count of unique reflections.
@user-pw5do6tu7i
@user-pw5do6tu7i 10 ай бұрын
Instead of generating candidates by extending the shape off every face, I think we could limit it by applying some rules. Only generate Right, Back, Down (+x, +z, -y). So a tetris S peice never checks for a tetris Z peice. And I think ontop of that you could only branch off the last added cube. The recursion will build them in a way slightly counter intuitive to us. But they should all be checked
@user-pw5do6tu7i
@user-pw5do6tu7i 10 ай бұрын
Instead of thinking geometrically, with a growth factor of ^n, think of it like a breadth first search.
@theirishpizzaguy6663
@theirishpizzaguy6663 10 ай бұрын
he is right
@haxi52
@haxi52 10 ай бұрын
Generate a cube of ints, size = max(x,y,z). Start in the middle and spiral out incrementing. These are weights. Multiply the weights by the shape and all them all up. Rotate the shape and do it again. The rotation that gives min is the one you hash. If the shape is symmetrical you may get duplicate mins, which kinda proves it works.
@devttyUSB0
@devttyUSB0 10 ай бұрын
Epic video description :)
@elraviv
@elraviv 10 ай бұрын
to solve the orientation problem. let's say you had a function that decide for each shape what is its' representative orientation. that way you would only need to compare and save one orientation. here is such a function: take the 24 orientations of the 3D array of 0s & 1s and treats each one of them as one long number. the representative orientation is the one that gives you the lowest number, and there should be only one. bonus fact now you can order the shapes by this value looking up a number in an ordered array is O(log(n)) true you may have to deal with a few thousands bit long numbers, but you could use run length code, as you did to store the shapes.
@RobertSandell
@RobertSandell 10 ай бұрын
My first thought of how to do the "have I seen this shape before" lookup was to use a binary tree, but that idea is not fully thought through yet. But by searching the multi array the same way each time in sequence that position would be a level in the tree and if it is a cube there you for example go down to the left and if not you go right and if there is no child on that side it means you haven't seen that combination before.
@jounik
@jounik 10 ай бұрын
One easy way to account for rotations would be to consider them an equivalence class and provide a canonization. I'm thinking something like having a well-defined comparator operation for the RLL encoded pattern, calculating the pattern for all the 24 rotations and only storing and hashing the smallest one.
@Ceelvain
@Ceelvain 10 ай бұрын
This video definitely has a Matt Parker vibe to it. Giving it a go and seeing what happens. The only difference is that Mike focuses on the programming side, while Matt would focus on the maths.
@daveduncan8090
@daveduncan8090 10 ай бұрын
Why worry about repeated shapes first? Calculate the number of times each shape of N▪︎ can fit in the cube after finding it. If there's a pattern there, you can use that to reduce the total to uniqueness and just first determine the maximum number of 16,17,18 piece shapes can fit.
@gianluca.g
@gianluca.g 10 ай бұрын
The topology of the shape is rotation and specular reflection invariant. I don't have an encoding in mind but my gut feeling is that we should encode the topology of the shape and hash that to see if there is a match. This might be a graph problem. Here's an idea, I just had that while writing the comment (rubber duck effect?): The set of possible places to extend the shape can be pruned out by not considering places that will result in a mirrored/rotated extended shape. This doesn't avoid the problem of rotating/mirroring the shape but at least the test is perfomed on just the current state of the shape vs the extension places.
@LearnedSome
@LearnedSome 10 ай бұрын
The problem described in this video is a 3D modeling problem. Instead of numpy, you should create a boundary representation for each shape. And then, if you created a complete shape descriptor, you could discard duplicates by making it a unique index in your target database table.
@davidwipperfurth8465
@davidwipperfurth8465 10 ай бұрын
When you generate a shape for the first time, generate each rotation, then put all the rotations together (add them or put them all side by side using some predictive orientation rules, whatever is computationally easy and distinct) and hash the group of them. Now you have a hash that represents every rotation of that object, use that hash as the key in a dictionary. When you generate a new object, make it's rotations, hash the group of rotations and check it against the dictionary of group hashes. With this method, you do still have to compute rotations, but you only have to compute rotations once; you don't have to check each rotation against the list. Another approach would be to compute some fast characteristic stats about a shape when you generate it (like, how many boxes have 6 neighbors, 5 neighbors, etc.) and use those stats to make an elaborate sorting tree. You know your shape can't look like other shapes if they don't share similar characteristics, so by following the tree branch for "this shape has *1* box with 6 neighbors" you immediately eliminate most of the searchable space, and every branch you follow from then on further reduces the searchable space in a roughly exponential fashion. It should be easy to calculate stats like neighbor count if you are starting from a lower order shape and adding one box, as you just take the stats from that shape and make a small tweak to reflect the one box you added. You could probably even figure out how to use the stats as a hash so you aren't even walking a tree or writing tree code.
@aloluk
@aloluk 10 ай бұрын
I'm early on in the comparison part, but i wonder if you compare bit counts given your storage. Its very efficient at least in C/C++ to count bits. Or maybe hashes per number of set bits. If you get feedback submissions, i'd love to see another video evaluating those.
@carlociarrocchi2793
@carlociarrocchi2793 10 ай бұрын
It is similar to finding all of the possible molecular structures (isomers) from a molecular formula in chemistry. That can quickly turn from a simple exercise to a nightmare (even without accounting for enantiomers and diastereoisomers)
@zxuiji
@zxuiji 10 ай бұрын
7:24, paused here for a moment, to me the math seems a bit simple, I don't remember what it was but I remember there being some sort of formula for generating the number of unique sequences that can get given N numbers - so for example 2 can generate 2, XY & YX, 3 can generate 6, XYZ, XZY, YXZ, YZX, ZXY & ZYX - I would use that on the result of width * height * depth, at worst would be only a few % out off from the actual number. **Edit:** btw the rotations could be dealt with rather simply, find the longest length of the shape and set that as the central axis, next you set the shortest radius of the remaining as another axis along that axis (for shapes like L where it needs to be flipped one way just use the longest hypotenuse between the shortest radius and the each half of the central axis lengths to determine which way is "up"), then finally you should have 1 orientation to check against. As for the check of "had this before?" just have multiple arrays, one for each cube count, you can use that count to quickly get to just the shapes of the same cube count (which should be a small number) and check against those. For optimising the array allocation you can use the formula that I mentioned above to pre-allocate enough slots for the majority and just generate shapes from cube count, incrementing until you hit width * height * depth, setting each new array as an offset of the original
@GamingKing-jo9py
@GamingKing-jo9py 8 ай бұрын
i had a similar problem where i had to store a hexagon block surface, so i :generated every possible list of vertex heights for 6 vertices ,removed some i didn't want (consecutive heights can only differ by one, which which made it so vertex heights < 4) ,then for each i :generated all possible rotations and flips ,sorted ,picked first (this eliminated symmetries) ,used that as a hash for a set (actually i used a dict) .so i got a list of all types of blocks i cared about .i actually did this for a friend who was trying to make a game
@GamingKing-jo9py
@GamingKing-jo9py 8 ай бұрын
with no numpy my python cracked at max height of 6 so finding out that they can only be < 3 after moving vertices to the floor help)
@GamingKing-jo9py
@GamingKing-jo9py 8 ай бұрын
the sort was an arbitrary choice as the canoncial form
@alikaperdue
@alikaperdue 10 ай бұрын
More efficient? - always grow from a placed block only into the positive quadrants. Anything that grows to the left must be a rearrangement of the object that grows to the right. - imagine the smallest box around the piece. When creating a new piece this box has sides a
@Txoka
@Txoka 10 ай бұрын
A way of encoding it such that the same shape rotated ends up being an equal representation is to just calculate the center of mass of the shape and select the rotation that minimizes it (or maximizes it, or moves it to whatever corner you want, it doesnt matter as long as its the same for every piece). Should be pretty straightforward. Of course first you need to match x=x y=y and z=z for the cut container of the shape. So you only have to rotate it if there is multiple ways of doing that to start with.
@paulpinecone2464
@paulpinecone2464 10 ай бұрын
The most efficient parallel AI algorithm for determining an optimal representation is to post the problem on the internet. This is a search technique called NerdLust. The most efficient way to run a 10 year computation is to go to sleep for a decade and then just run it on your toaster.
@SuppaflyZSM
@SuppaflyZSM 10 ай бұрын
I feel like there is probably some way to use vector math for this that mathematicians would immediately recognize and know how to handle.
@gougz99
@gougz99 10 ай бұрын
Good idea! Maybe create all vectors starting from the center of each cube to the center of all the other cubes, then sum them or something.... and multiply by matrices for rotations
@JoQeZzZ
@JoQeZzZ 10 ай бұрын
I don't think so. It doesn't feel like an affine problem for me, since there can be holes like a u shape. You could mangle it into a vector problem by sort of serialising the 3D space, but that's just a transformation you can do after you've found a solution. The only smart thing you can probably do with vectors is an easy way to rotate and/or mirror found solutions through matrix multiplication, but that's only syntactically clever and wouldn't speed up the process computationally
@freee8838
@freee8838 10 ай бұрын
Gpus...
@samanthaqiu3416
@samanthaqiu3416 10 ай бұрын
lattice math is inherently hard, otherwise it wouldn't be a credible cryptographic scheme
@DerClaudius
@DerClaudius 10 ай бұрын
Rotation independent hash: - generate all rotations - shift towards 0,0,0 as much as possible - interpret array as binary number and take minimum number as representative for that shape - hash that
@tdug1991
@tdug1991 10 ай бұрын
My first thought would be to have each square/cube represent a power of two, and the hash value of each shape would be an integer, up to n^3 bits. You'd store/lookup the minimum [or maximum] of each of the 24 rotations. You still need to do all rotations on each shape, but you only need to compare them to a hash table that you could binary search, if constructed as an ordered list. I think this only saves a CPU/memory factor of 24. The hashes would also be unique per-generation, so you'd have to use the max grid size for the smaller shapes, or at the very least, recompute them for each generation on the new grid. For n=16, this would be 512 bytes per hash, if stored with perfect efficiency. There may be a more efficient way to determine a "primary" rotation of a set of cubes, or perhaps a way to generate fewer duplicates in the first place.
@nytehauq
@nytehauq 10 ай бұрын
Symmetry-aware sparse voxel directed acyclic graphs might help for compression/representation of really large configurations of cubes.
@mjlavall
@mjlavall 10 ай бұрын
Update your encoding method to include all rotations of the shape in a well ordered way, so as soon as you add a shape to your hash set, the hash is calculated from all possible rotations of the shape. The next time you encounter a shape, no matter the orientation, its encoding will match the one already in the list. You will have to calculate all rotations for each shape, but only have to do one lookup.
@Akronymus_
@Akronymus_ 10 ай бұрын
Has been mentioned in the video. And wouldn't really work, as it explodes the memory footprint and ruin the cache.
@matsv201
@matsv201 10 ай бұрын
It's way simpler to just have all objects in the same orientations and that is possible for most objects with a xyz and a COG xyz order. This way most objects can be discarded even prior to search.
Discussing PDF@30 Years Old - Computerphile
14:33
Computerphile
Рет қаралды 94 М.
Cracking Enigma in 2021 - Computerphile
21:20
Computerphile
Рет қаралды 2,4 МЛН
Не пей газировку у мамы в машине
00:28
Даша Боровик
Рет қаралды 10 МЛН
Do you have a friend like this? 🤣#shorts
00:12
dednahype
Рет қаралды 11 МЛН
The Game of Set (and some variations) - Numberphile
22:34
Numberphile
Рет қаралды 174 М.
Floating Point Numbers (Part1: Fp vs Fixed) - Computerphile
15:41
Computerphile
Рет қаралды 156 М.
What is an Interrupt?? | Interrupt Handling in OS
4:56
In and around Computer
Рет қаралды 1,1 М.
What If You Put Nutella In a 3d Printer?
7:35
The Action Lab
Рет қаралды 227 М.
Discovery of the Aperiodic Monotile - Numberphile
31:04
Numberphile
Рет қаралды 195 М.
Kernelless Kernel Programming (eBPF) - Computerphile
19:12
Computerphile
Рет қаралды 70 М.
L Systems : Creating Plants from Simple Rules - Computerphile
15:16
Computerphile
Рет қаралды 43 М.
Ethernet (50th Birthday) - Computerphile
26:18
Computerphile
Рет қаралды 127 М.