Mike's Cube Code - Computerphile

Рет қаралды 121,229

Күн бұрын

Пікірлер: 646

@ralphzajac8397 Жыл бұрын

"The Programmers' Credo: We do these things not because they are easy, but because we thought they were going to be easy,"

@aceman0000099 Жыл бұрын

*creed

@DavidLindes Жыл бұрын

@@aceman0000099 they're synonyms... so... user's choice.

@aceman0000099 Жыл бұрын

@@DavidLindes credo is the word for creed in other languages, not English

@QuantumJump451 Жыл бұрын

@@aceman0000099really? It's in the dictionary, pretty sure they're both English words

@alibarznji2000 4 ай бұрын

Hahahahahaha, Amen

@bmitchem457 Жыл бұрын

"Starting programming it thinking it'd be quite trivial" Famous last words

@pdrg Жыл бұрын

Laughs in RegEx

@carlociarrocchi2793 Жыл бұрын

That's what happens with all of my projects.

@gilly_the_fish Жыл бұрын

I call them "famous first words", because that's how many of my projects have started. "Oh this should be trivial" *disappears for 3 weeks* "Guess I was wrong"

@snack711 Жыл бұрын

happens to me everytime, starting to slowly get more realistic about it

@custard131 Жыл бұрын

cant wait for the "Someone improved my code by 40,832,277,770%" follow up video

@letsgocamping88 Жыл бұрын

I'm looking forward to the coding train version. And the chaos that ensues!

@knssoftware6018 Жыл бұрын

Especially if it is Mat Parker who nails it.

@downstream0114 Жыл бұрын

@@knssoftware6018 It's gonna be a Parker optimization, 40,832,277,770% slower.

@JoQeZzZ Жыл бұрын

He at least used numpy, so that's already about a 1000x seedup from how matt parker would've done it.

@k0pstl939 Жыл бұрын

@@JoQeZzZimport math as maths

@aikumaDK Жыл бұрын

"should be easy enough to code up" "the record is for n=16" "okay, it's gonna be borderline impossible"

@adamsbja Жыл бұрын

I love these sort of ideas that are easy to grasp, get complicated, but get complicated in an understandable way. As soon as he starts laying out his method you can see why it explodes.

@error.418 Жыл бұрын

love that the record is actually n=18 from 2006 "Counting d-Dimensional Polycubes and Nonrectangular Planar Polyominoes" by Gadi Aleksandrowicz & Gill Barequet (2006)

@tommydowning3481 Жыл бұрын

What I like about Mike's videos is not only the content, but the genuine enthusiasm he brings to the table. It's very enjoyable and relatable. Great stuff.

@amycoscyrus5740 Жыл бұрын

He really shows that he loves his job and the work with numberphile

@whenindoubt Жыл бұрын

This is a graph-coloring problem. The transform-invariant representation of a polycube is a cube-adjacency graph. Each polycube record also stores a mapping from (axis, direction) to a "color" (index), and each adjacency edge is "colored" by the (axis, direction) tuple. Two polycubes are identical iff their mappings can be reconciled such that they can be colored equivalently. (Corrections for chirality can be made by imposing ordering restrictions on the color mapping.) I don't think there is a characteristic graph encoding that will produce a unique hash value (this is what makes the problem interesting). But an approximation that will converge to an acceptably-small set of candidates is the number of vertices of each degree and number of edges of each color, sorted.

@juancarlosvasquez1865 Жыл бұрын

Great comment. As the latest fields medals on combinatorics, it is always a representation bottleneck problem. Like Feynman Lagrangian-Quantum Mechanics based in least action principle against the Schrodinger Equation Hamiltonian approach.

@DavidBerglund Жыл бұрын

Going to go a thumbs up on this because it sounds plausible but having basically no math background, this could just as well be that I'm being trolled on the internet again... 😂 But no this sounds like such an interesting field to think about. So, combinatorics, is that it?

@jamesrivettcarnac Жыл бұрын

What I came for

@alegian7934 Жыл бұрын

so you're saying this boils down to graph isomorphism?

@CatoSierraWasTaken Жыл бұрын

How do you represent chirality using graph coloring?

@michaelpound9891 Жыл бұрын

I can’t express to everyone how happy all these fantastic comments make me :D Get coding people we’ve got a record to set!

@sephirapple7317 Жыл бұрын

@michaelpound9891 Hey, I don't know if you've already considered this, but computing 3d transformations can often be easier by using 4 coordinates instead of the traditional x y z system. I don't know if this would work in this case, maybe you've already tried using quaternions, but if you haven't it may be a helpful tool here, since they often make computing/encoding 3d rotations much easier. Hope this helps somehow, I don't know enough about the problem to say how it might be implemented here but since quaternions are often helpful for rotation I thought it might be worth mentioning in the hopes it could be useful to you, best of luck with it either way! 🙂🍀

@PedroContipelli2 Жыл бұрын

Apparently the record will be harder than previously thought! Current record is actually n = 18, computed in "Counting d-Dimensional Polycubes and Nonrectangular Planar Polyominoes" by Gadi Aleksandrowicz & Gill Barequet (2006). Wikipedia page has been edited! n = 19 next?

@error.418 Жыл бұрын

we're not your personal coding army

@aloluk Жыл бұрын

I would but I dont have spare time now a days.

@napukapu Жыл бұрын

You are the GOAT lecturer of Computerphile. I subscribe for Mike's wisdom.

@YingwuUsagiri Жыл бұрын

Mike is just so relatable. Way too often has a colleague said something and then me going hmmmmm I can do that right? Should take fifteen minutes. And then it took six hours, many pages of research and a ginormous headache and it will absolutely be a one trick pony solution that will never ever see the light of day again outside of this one usage.

@roberttalada5196 Жыл бұрын

That’s maths for you

@AlphaPhoenixChannel Жыл бұрын

Man watching this immediately before bed was a bad idea… I WAS planning to go to sleep, but now I’m laying in front of a sheet of 11x17 agonizing about chirality

@GeorgeBratley Жыл бұрын

Please do a follow up looking at some of the techniques people used to make this more efficient!

@goat5249 Жыл бұрын

This video has been up for over an hour... So has the world record been beaten yet?

@shouldb.studying4670 Жыл бұрын

@Alex-ck4in Жыл бұрын

I'm tempted to take a stab (and fail 😂)

@BenJamin-rt7ui Жыл бұрын

Turned on my radio today. Imagine my surprise to hear Dr Pound enlightening the listeners on the subject of AI.

@michaelwilson5742 Жыл бұрын

Not as big as ours, as our press office gave it the same level of coverage as when he was on bbc click. Which is zero. 😮

@cmelonwheels Жыл бұрын

I never know whether to feel reassured or concerned that the struggle of "I thought this would take 15 minutes but actually it was 8+ hours" continues all the way up through every level of programming experience.....

@CTimmerman Жыл бұрын

That's mainly due to not knowing every detail about the libraries you're using, and the lack of people doing automated regression tests with 100% coverage. A new Pillow bug broke one fuzzing instance of my steganography program for example, but i don't care enough to monitor that after linking my test in their issue tracker.

@BooBaddyBig Жыл бұрын

One interesting thing would be to find the center of mass of the shape and make the coordinates from there. You have to come up with a hash that can be quickly updated as the centre of mass moves around. And you could use the centre of mass as part of the hash, if the centre of mass can't line up, then it's a different object already. For example if the CofM is at (1.2, 3) but another is at (1.5, 4) then you know they cannot match because no square rotations or integer translations can move 1.5 to 1.2.

@GuentherJongeling-hu6oe Жыл бұрын

Another idea for the notation I had is from some ideas of organical chemistry, where there is a naming scheme which uses the longest chain of Carbon atoms in a molecule as a base, and has all side-chains named in a specific way too. For relatively simple structures like these cube structures are, there is only one way to name it. This would require computation of the chain lengths, but could maybe account for rotation. This is quite a bit harder in 3D probably, but maybe with some modifications it might work.

@pettere8429 Жыл бұрын

Use the largest x-by-y-by-z block in place of the longest chain and then number the possible attachment points for features in a systematic way and it should work.

@Ylyrra Жыл бұрын

The issue with normalisation like that is whether calculating the normal form takes longer than just doing the fairly inexpensive rotation and hashing. In aggregate the rotation and hashing is expensive, but individually it isn't giving you a great deal of budget to find an improvement in. I'm not saying this approach is wrong, it's where I'd start looking too, it's just that the solution needs to be REALLY simple and fast to be an improvement.

@kristiankjaergaard Жыл бұрын

Geometric mean followed by principal component analysis?

@segganew Жыл бұрын

Heh I thought of something like SMILES too

@bogdannikolakov2159 Жыл бұрын

It’s been four hours now, so someone has already improved the algorithm by 100x, CUDA’d the shit out of it and ran it on a 2000 GPU server farm

@NoomerLoL Жыл бұрын

Perhaps you could align the 3D object by its dominant orientation (similar to how it's done for Scale-invariant feature transformations) and then store the rotation invariant hash of your encoding, that way you only have to look up one hash per object. No idea how efficient this would be, though.

@rich1051414 Жыл бұрын

A foolproof way to align any shapes rotations to a deterministic orientation that requires less computation than actually rotating all the different possible rotations would be the way. Otherwise, you could just use a standard unique hash function and simply assign the min/max hash of all of those rotations, for the duplicate prevention.

@brunobastos5533 Жыл бұрын

better than say is doing it , he got the code in git or can start a new one , is great exercise

@oasntet Жыл бұрын

Hash lookups aren't actually all that expensive, and it's a finite number of them regardless of population size. So long as the lookup remains constant time (and there were cases in Python where this stops being true, but I can't speak to current Python or numpy) then the only difference is memory usage. That's not nothing, but at the n=12 polycube it's probably not a major concern. I'm betting either the hash lookup needs a tiny tweak to return it to constant time, or some other part of the algorithm is the real issue.

@rsv9999 Жыл бұрын

then you can use this knowledge of all 24 rotations to store if the object is equivalent on each side, reducing the amount of times you add a cube to both sides of the same shape, needing to go through all 24 iterations more. potentially this could even eliminate all duplicate objects, although I doubt that's true but would take too much time for me to think about.

@NicholasThompson1982 Жыл бұрын

This sound similar to my idea of representing the shape’s 0’s and 1’s as a binary number, apply all rotations and pick the largest/smallest one as the primary hashed version. In fact… why even hash it if it’s just a very large integer? Could probably even avoid casting to an int if you just binary sort 24 long-binary numbers?

@landsgevaer Жыл бұрын

If anybody wonders, this is OEIS sequence A000162. YT doesn't like me posting the link, but that should suffice to find it.

@squishmastah4682 Жыл бұрын

When you're randomly browsing KZbin and Mike drops some code in your eyeballs...

@damicapra94 Жыл бұрын

That's the good stuff

@garethdean6382 Жыл бұрын

Having done organic chemistry my mind automatically went to carbon compounds and some code I once idly made to calculate possible alkanes. If each cube is represented as 'c' and a face as 'f' you can write a formula for a possible cube. The first one-cube monomino is CF6, the two-cube domino is C2F10 (F5C-CF5) and so on. Deriving a formula is simple enough. For repetitions you only need check identical formulae. (A block of four cubes is not the same as a line of four cubes though an L and a line of four cubes would be.) There's more to it but boy am I too lazy now to resurrect a decade old project. Kudos to you, math-man!

@johnrozewicki7132 Жыл бұрын

I worked on this problem for protein comparison as part of my academic research. You can store these in a rotationally invariant way by encoding subsequent cubes in reference to an "origin" cube and not in global coord space.

@bastscho Жыл бұрын

However this does not work unless i am missing something. Suppose we have 🟥⬜ where the red cube is the origin block. Obviously the white block is directly connected to the redblock with distance 1. Thus ⬜ 🟥 Would immediately be detected as an equivalent to the first. But now we build further to 🟥⬜⬜, where the third block has distance two from the origin in the east direction. However we can also build ⬜🟥⬜, with two blocks of distance one. Now both of these are equivalent, as we do not color the blocks. But your system may not detect this.

@tonyd6853 Жыл бұрын

The more accessible you can make these for visually impaired folks. The more people can listen to these on their daily commute while their visual faculties are otherwise consumed.

@Dayanto Жыл бұрын

A trivial way to find a universal encoding is to simply compare the encodings for all the rotations and consistently picking the one with the lowest value. This always yields the same result regardless of the initial rotation. This requires 24x less space and lookups, but the tricky part is optimizing the encoding step to deal with all the rotations. The key here is that since we're only looking for one representation for all symmetries, we can aggressively prune other options as early as possible, before doing most of the heavy work. This has a theoretical upper bound of 24x faster than the original by disregarding irrelevant symmetries. In addition to this, you can represent the longest dimension as a bit field (uint) for an additional 5-16x speedup (when n=16) on allocations, encoding, comparisons etc. This might also be a bit more compact than a run length encoding. Note however that this requires special handling for equally sized dimensions since it only works in one direction. All in all, combining all of these tricks, you might in theory be able to speed up the algorithm by >100x. In reality though, it's probably much less on average.

@hanifarroisimukhlis5989 Жыл бұрын

My thinking is to "canonicalize" the shape such that for every rotation it resolves to the same shape (therefore same hash). It might be done by first rotating until (block sizes) x >= y >= z, then rotating each axis 180 degrees, minimizing center of mass coordinate. In the end the orientation will minimize center of mass coordinate, and since there ins only one way it can be done, it's canonical. EDIT: It's a bit moe efficient, since it only took 5 rotation at most rather than trying all 24 rotations.

@kindlin Жыл бұрын

I literally just got done typing this and scrolled down 1 more screen. And it would be much more efficient, as with a canonical rotation, you only compare 1 shape against a million others, not 24 different rotations against a millions others.

@randEveScrub Жыл бұрын

6 years ago I was asked to implement this in a 1 hour google interview... it has been haunting me ever since

@Philiquaz Жыл бұрын

Small thought for translation - this doesn't play a factor when there's only a few cubes, because those also fall under rotations. But translational symmetry will start catching you out once you're at 4x4x4 or larger. There's a lot of ground to be made in skipping over entire branches/systems that will always produce symmetric results. A comment earlier mentioned storing the moment of inertia as a quick hash for faster lookup - and another solution was aligning to a "primary" orientation to avoid rotational symmetries. It's a rough problem for sure.

@rmsgrey Жыл бұрын

He mentioned that he's already clipping the shapes to their bounding cuboid, which eliminates translation.

@MrSonny6155 Жыл бұрын

Of course, it will be a trade off between computing the orientation hash vs your poor sticks of RAM that will soon be seeing the lights from above.

@richardikenna1124 Жыл бұрын

Even when I don't get it sometimes, mikes enthusiasm makes it fun. Thank you

@marsovac Жыл бұрын

To account for rotation/flips use the hash that is "an array of distances of each cube to every other cube in the shape, sorted". This should be unique for each. If every distance of every building block to all others is the same, it should be the same shape, possibly flipped or rotated, at least in Euclidean geometry. For 20 cubes that would be 210 simple arithmetic operations + a sort on 20 numbers. Still probably much faster than doing the rotations.

@AkantorJojo Жыл бұрын

This would miss out on chirality. As "lefthanded" and "righthanded" shapes would give you the same numbers despite beeing different - as they can't be rotated into each other. But otherwise I really like the idea :D

@letsgocamping88 Жыл бұрын

@@AkantorJojosurely the left handed version of a shape is in fact a new shape?

@1paulromero Жыл бұрын

@@letsgocamping88not in 3d

@Bunny501 Жыл бұрын

@@1paulromeroYes they are different shapes. Your left hand can't be rotated to be the same shape as your right hand.

@Klisteristhashit Жыл бұрын

⁠@@letsgocamping88that is what @AkantorJojo is saying. But the proposed method, using Euclidean distance between the cubes, would not identify a right handed and left handed shape as different. Because all distances when sorted would be identical.

@coderentity2079 Жыл бұрын

Divide the cube that contains the shape into 8 cubes (half size on all axes), count the voxels in them, store the rotation that produces the smallest number and store the chosen rotation (0-23). Do the same for all 8 sub-cubes recursively. To find out if you already have a certain cube becomes easy: do the same dividing/counting/rotating with it and you can compare/search with a B tree or hash - when match found do this for the sub-cubes too, but remember to match the rotation too in them. You basically look for cubes with the same density distribution in finer and finer resolution.

@poorman-trending Жыл бұрын

Instead of hashing you can define a standard rotation and rotate each new object into that orientation before looking up. For example, as each array could be considered as a binary number, rotate each new shape 24 ways and choose the orientation with the smallest binary representation. Then lookup against the rest of the previously stored shapes. Not sure if this would be faster or not but its an alternative that may be.

@rsv9999 Жыл бұрын

had the same idea matey, i was thinking of storing the addition information for mirroring on an axis, meaning adding to that mirrored axis would cause the opposite side of the object to result in the same new object, reducing the need to search since you've already noticed that it's the same shape, a little caching really

@gone357 Жыл бұрын

@@rsv9999 what if you looked from the 6 faces and compared the proyections?

@xiexingwu Жыл бұрын

@@gone357in 3 dimensions, there may be holes in large structures that don't show via the projections.

@vladimirfokow6420 Жыл бұрын

Had exactly the same idea👍 This function takes in a shape of cubes, and outputs a number that uniquely identifies this shape. Any other rotation of this shape will produce the same id, and a different shape will definitely produce a different id. - This eliminates any collisions that the usual hashing can have. - And this is the way to produce a rotationally invariant id for the shapes. - (and if you also store the dimensions beside this id, you could reconstruct the object from the id - just reshaping this array from flat to having these dimensions)

@matsv201 Жыл бұрын

You can't really do that because before you checked all of them, you don't know what is the smallest one. A way around this is using COG as a way to orient the the part. They you would instantly know if it's the primary orientation.

@hurktang Жыл бұрын

Just brainstorming in here... We could describe all shapes by starting at a point and moving in all the dimension, so you can go (u)p, (d)own, (r)ight, (l)eft,(f)orward or (b)ackward. You can also re(w)ind a block back to start a second branch from a previous block. With those sets of 7 moves, you can describe all possible shapes. Then we can add building rules so you always go up first, then right, then forward, etc... and do the least amount of rewind possible. So all shapes would have 1 preferred code for it. The Z shape would be URRU and the T shape would be UURWL. All shapes could be described with this 8 base system. And, I might be wrong, but that might be easy to handle with the proper code. Only having to test 1 orientation per shape, having low memory cost in the list for shapes and above all probably be able to rule out a ton of shapes before even testing it against the list.

@MNbenMN Жыл бұрын

I was about to write down an algorithm to simplify the representations as a series of vectors, but realized it would only work (with implicit unique representations) for shapes that don't branch and don't form loops. I get why Mike said a bunch of people said it would be easy, but never got back with an algorithm.

@Imperial_Squid Жыл бұрын

Editing the wiki page in two weeks "All 17 polycube combinations were discovered by Mike Pound (via outsourcing to the computerphile community)"

@AF-lt2fr Жыл бұрын

If you assign the starting origin as 0,0 and apply the rules of no diagonals and must be connected - there's no need to do flips or rotations. It's only when adding the third dimension - e.g. 1,2 is not the same as 1,1,1. So for 4 you could have, 1,1,1,1 1,1,2 1,2,1 1,2,1 (2 not touching) etc. Since there's a maximum of 6 faces which could be joined to it there's a finite amount of faces you can attach to. E.g. for 1 cube there's 0 faces For 2 cubes - 1 face For 3 cubes 1 face + 2 faces 4 cubes As 3 + 4 Etc.

@ChiefArug Жыл бұрын

i love problem solving videos like these! please do a follow up if significant improvements are found

@yogeshsingla131 Жыл бұрын

Here’s my hypothesis without using brute force matching of all 24 variations. For the shapes, to match the number of cubes must be same. That is, the volume is same. If the shapes are same, their centre of inertia has to be same. Each cube is reduced to its centre and the entire shapes centre of inertia can be calculated and checked in 3D floating point space.

@EdwardChan.999 Жыл бұрын

The position of the center of mass as well. If you have an L shape, adding a cube to the top left or bottom left or top right would change the moment of inertia by the same amount, since it's just the sum of each mass times radius. But then there's the problem of tracking center of mass after rotation.

@speedstyle. Жыл бұрын

There are multiple shapes with the same bounding box, volume, and centre of mass. For example a 4x3 rectangle with empty insides ⠯⠽ and one with only the middle filled ⠽⠯

@pmmeurcatpics Жыл бұрын

@@speedstyle. Yes, but after using this rough filter what you'll be left with is a (hopefully, much) smaller number of cubes to search through

@speedstyle. Жыл бұрын

But it doesn't help with the rotation? and hash tables don't care how many options you're searching through.

@comradepeter87 Жыл бұрын

i was thinking something similar but much dumber. You just take the average of all the coordinates of all the cubes. Now, with this "centre" of the shape as the origin, write out the coordinate of each of those squares in a list, and sort it according to some order. Now hash this value, and it should remain the same for all the rotations and translations.

@aungthuhein007 Жыл бұрын

Absolutely love seeing Mike talking every time!

@slimeboos4065 Жыл бұрын

I have a suspicion that this could easily become an advent of code problem for 2023

@tails4e Жыл бұрын

This fear crossed my mind too 😅

@Revenga849 Жыл бұрын

day 18 of 2022 already has similar vibes with polycube surface calculation

@Zach010ROBLOX Жыл бұрын

Now I want to try this. First naive approach will be to think about the max space taken up by all orthogonal combinations (where a 3x3 would be a d8 shape of voxels/cubes) then use some theory to reduce the number of possible shapes that fit in that space (like dividing by number of symmetries). Lastly, the trick would be finding the number of diagonally attached shapes still possible in the space. Or some variation of these steps and other methods of removal.

@KuroOnehalf Жыл бұрын

I'd love more videos like this. Working through problems like this is fun.

@Winasaurus Жыл бұрын

"Now you've got a kind of t-shape, whatever that is called in tetris, I'm sure it's got a name" It's just called a t-piece. They're all named after the shape they look like so T, S, Z, J, L and I.

@oasntet Жыл бұрын

In before somebody cites that photoshopped tetris manual that was a meme four years ago.

@BeCurieUs Жыл бұрын

My brain instantly went to nodes vs actually representing these things in space. I guess the issue with nodes, though, is they lack the geometric identidy we need to identify similar looking shapes. The basic idea I had was you could limit the nods to 6 connections representing the 6 spatial directions in a cube. How to get all the info we want out of that configuration requires a bit more thought :D

@flake8382 Жыл бұрын

Love this problem, and I further love that it seems solvable for 3 dimensions. - solvable, as in regardless of how many cubes you use, it can be solved in some sort of consistent and logical timing pattern, like N, or N^2 or something like that. Solved as in a completed proof exists to confirm the algorithm never misses anything or miscategorises, and is as efficient and low weight as possible.

@0xdeadbeef444 Жыл бұрын

There's got to be a way to reduce this to a graph problem. Maybe storing the poly cubes as graphs/trees?

@jannicklumme Жыл бұрын

Thought about that too and then look up if the one in memory and the new one are isomorph. It might be that it is too slow due to graph creation overhead and isnt something like checking for isomorphism in O(n^2) or O(n^3)? Dont know the time complexity of his current strategy or other viable strategies

@aditya95sriram Жыл бұрын

I think converting to a graph might lose some information. For example, in 2-d, with 3 cubes, an L-shape and straight line might get converted into identical graphs right (i.e., straight path with 3 vertices)?

@nolifeorname5731 Жыл бұрын

@@jannicklumme I had the same idea and checked. Supposedly the current state of the art for general graph isomorphism is 2^O((log n)^3) EDIT: This problem actually has a bounded valence, so apparently it's solvable in polynomial time

@drdca8263 Жыл бұрын

@@aditya95sriram what about a decorated graph?

@mtardibu Жыл бұрын

That's my plan. I'm also planning on only developing the tree in the order of a BFS.

@forTodaysAdventure Жыл бұрын

1. start at 1 block and add a block to it at each face, then add a block to each one of those faces like you were saying, to build up the tree of possibilities 2. to hash the possibilities, start at the lowest block (the one closest to 0,0,0) and ask what combination of blocks does it have around it. for example, 1 neighbor is a 1, 2 neighbors on opposite sides is a 2, 2 neighbors touching is a 3. 2.5 attempt to map this to a hamiltonian path (one where you can travel to every block without backtracking) 2.6go to the next neighbor block (next neighbor is a touching block that is next closest to the origin, prioritizing closest to 0=x) 2.75 if you have to backtrack it, denote that in the hash by going into the decimals 3. convert that block to a hash and compare it to your hash table continue until you have n blocks

@CD4017BE Жыл бұрын

You can get a hash that is orientation independent by using an arbitrary hash-function and apply it for all 24 orientations of the shape and then compute the sum (of any other commutative and associative operation) of these hashes. Another optimization would be, instead of 3D arrays of bytes, to use 2D arrays of 16-bit integers (no axis will ever need more than 16 unless you break the record), storing the last axis in the individual bits. This both saves memory and makes comparison faster, but it makes rotation more complicated.

@d34dc0de Жыл бұрын

I was looking for a comment if someone had thought of this 😄My though process was more of computing the hash of the concatenation of all rotation as a binary sequence Edit: hexadecimal usernames ftw !

@1337GameDev Жыл бұрын

13:51 - Treat them like 1d, 2d and 3d shapes. They are VOXELS. So just use binary to encode the shapes, 1 or 0 for each voxel in the 3x3 cube. Then.... just perform a standardization on the data to use a gravity centric method to force 0,0,0 to be the gravity center. Then just "pull" on the center of mass of the voxel towards this, and choose minor edge cases for "equal" weighted edges/balanced voxel shapes (EG: [010][010][010] looking across the shape as a series of rows from top to bottom, front to back). Then you can converge multiple rotations into a standard rotation, then just hash all of them. WORKS well for parallelization too -- first process all normalized rotations, then process sorting based on the converged shapes and have a base ID given to each distinct shape and original input and record that when sorting. Then..... convert that code into CG and run on a GPU -- as this is well suited for matrix representation and operations. Essentially you have 2000+ gpu cores being able to execute this code on a 3080....

@seanm7445 Жыл бұрын

My first thought would be to save each shape at a specific rotation. Some common rule such as ‘close bounding box, put largest dimension at x, smallest dimension at y...’ So technically each shape will end up the same. Then for each check, you first put it into that specific rotation and then lookup the hashtmap.

@holthuizenoemoet591 Жыл бұрын

how would you find the common rotation, once a new object is made, what rotation does it have?

@matsv201 Жыл бұрын

Yea.. X>=Y>=Y, that eliminates 8 possibilities out of the 24. Ad COG values equal or larger than 0 for X,Y,Z then every object is in the same orientations as well as rotation. 24 is down to only one possibility non of the others even need to be checked, just discarded. That as a extra bonus, now they can be indexed by numbers, making them sortable. That will speed up the look up table but several orders of magnitude.

@kindlin Жыл бұрын

I had the same thought, but then also realized the obvious problem of square bounding boxes, that leaves a whole dimension (or two) undefined. I think the solution might be simple, tho, compute the center of mass (volume/cubes/etc) and if it's off center, use that direction as one of the axis, and if it's not off-center, reduce the bounding box you check consistently until it becomes off center, and if every cube of cubes is the same, then it's the same shape anyways.

@Spongman Жыл бұрын

Sean's question at 8:34 is right on the money. It's significantly cheaper to calculate the rotations at the time you add a new unknown shape than it is every time you query for existence. the added storage (24x) and lookup time would be insignificant.

@foo0815 Жыл бұрын

@@chrismanning5232 Lookup time is insignificant, 24 times Storage quite feasible, it's much less than the branch factor for one more cube

@Spongman Жыл бұрын

@@chrismanning5232 I wouldn't characterize "multiply by 24" as an explosion. sure, it's more. but it's a constant factor.

@happygimp0 Жыл бұрын

For the rotating problem: When you generate a new structure, rotate it in every way and store the resulting rotations as a integer (or list or string, doesn't matter that much). And then only use the one with the lowest (or highest if you prefer that) one. And then only use this one rotation to see if it is in the list of all generated structures and if not only store that rotation. Of course, even faster would be a way to only generate each structure once without keeping and checking a list of all generated ones. That would also help a lot for paralleling it on multiple threads or (a) GPU(s).

@MesaCoast Жыл бұрын

Easy way to reduce comparisons: all you need is a way to choose 1 "preferred" orientation of a given shape out of its possible rotations, and only check that rotation against your list. One easy way to do this is to read each possible rotation of the array as an integer (since it's an array of bits, it's already an integer, you're just reading it differently), and then choose the greatest integer. Then, check if you have that shape in your array already, if not, then add it. You can also order your shape storage by this integer to allow for binary searches, reducing the comparisons even further.

@joewakeling6867 Жыл бұрын

I was once looking for a rotation and reflection agnostic labelling convention for 2D grids of on/off cells. My system was to place the pattern in the smallest n×n square that would contain it. Then label the pattern by starting from one of the corners and going clockwise or anticlockwise recording each on/off cell with a 1/0, repeating the process in the same direction from the cell diagonally inwards from the cell you started from. The trick to uniqueness is how you select the starting corner and direction. In essence, this was chosen so that interpreting the label as a binary string gave the biggest number possible. So first, if you can translate the pattern at all without it leaving the square, try to put an 'on' cell in a corner so you can start from there. Failing that, you want the longest string of successive 'on' cells leading away from a corner, or if there are multiple equal length strings of successive 'on' cells, pick whichever is followed by the shortest string of 'off' cells before the next 'on', and continue in this way until there's a unique choice of translation, start corner and direction that maximises the value of the binary string, or multiple traversals have traversed the whole square and therefore have the same labelling(i.e. the pattern has some symmetry) Not sure if this is any more efficient than what others in these comments have suggested about just doing a hash for all rotations and picking the min/max. Suspect you could change my technique to use rectangles rather than squares and remove the translation question. Doubt you could use this very easily in 3D since that gives a lot more traversal options

@davidmurphy563 Жыл бұрын

There's an obvious way to reduce the lookup. You can use a vectorised encoding. Basically, the idea is just to ensure discarding the absolute position info: 1. Subdivide in half, pick the side with less boxes but with at least one. Repeat. Tree check to the least box box. Don't use recursion, just loop. 2. Describe the object with a vector path string. Up then right the forward. Make sure you cull the search, up then right means you can skip right then up. 3. Search this against your set. You see? We're basically storing a graph/vector so orientation info is discarded. If we use the same algorithm to select the starting cube and then subsequent cubes then rotated and flipped cubes will give the same output. Worst case there could be objects with have symmetry resulting in a reversed path but I'd just run the list and then cull it after. That should be easy to pick up.

@coderider3022 Жыл бұрын

I made a simple Rubix cube solver in python and hit these issues, needed a few colour coded xls files to understand it first. Glad the dr covered all my approaches. My goal was to learn python and see TTD options. (I’m c# / azure dev )

@skleon Жыл бұрын

I believe Zobrist hashing would be a very promising approach. It's what chess engines use to hash chess positions to build a transposition table. Given a previous shape with its hash, it's very fast to calculate the hash of a new shape that only has one more square in it.

@aditya95sriram Жыл бұрын

I looked up the Wikipedia article, it's an interesting technique. However, the hash does nothing to account for the different symmetries (rotation, translation, mirror etc) which makes sense because in chess you wouldn't really worry about those symmetries.

@BooBaddyBig Жыл бұрын

I think you'd be better off using a non random hash based on modulo arithmetic. That way multiplying by a number can be the same as translation. So each square has a number, and multiplying by 3 different prime numbers is equivalent to moving in the x, y, and z direction (modulo a larger prime number) respectively. Provided the prime numbers are bigger than the maximum shape and coprime, then the hash should be completely unique. You just add on the number for that square.

@bertiesmith3021 Жыл бұрын

If you store your cube as a bit array you can treat then order the various rotations and translations. Pick the smallest number and then that is the single unique description of the shape. This can be trivially hashed. And also the bounding box for n doesnt need to be be n by n by n - it only needs to be n x n/2+1 x n/4+1, since a valid solution which doesn’t fit will have a rotation or translation which does.

@SStefanovski Жыл бұрын

An encoding that accounts for rotation might be to just encode the shape from the 8 corners of the bounding box and add them all together, making the rotation problem an addition problem which is faster for the CPU to compute. If you label each cell of the 3D coordinate space with a prime number, starting from one corner and ending in the opposite corner then you could represent a shape as the product of the primes. If you do the same but start counting the primes from the 8 different corners and add all the products together you get the key for that shape to lookup in the hash table. Not sure if it would work in practice though.

@pierreabbat6157 Жыл бұрын

A couple of suggestions: *Write it in Julia or Haskell. Both are compiled to machine code, hence fast to run, and also fast to develop in. *Categorize the polycubes by how many of each color of cube they cover in something like a checkerboard.

@gianluca.g Жыл бұрын

The topology of the shape is rotation and specular reflection invariant. I don't have an encoding in mind but my gut feeling is that we should encode the topology of the shape and hash that to see if there is a match. This might be a graph problem. Here's an idea, I just had that while writing the comment (rubber duck effect?): The set of possible places to extend the shape can be pruned out by not considering places that will result in a mirrored/rotated extended shape. This doesn't avoid the problem of rotating/mirroring the shape but at least the test is perfomed on just the current state of the shape vs the extension places.

@TheFinagle Жыл бұрын

For checking rotations one thing you could do is Come up with a way to 'score' the shapes, the 'lowest scoring' rotation or reflection is the main form. For each new shape you only need to check the 24 reflections and rotations to see which scores lowest, then check if that specific form is in your list. One simple way to score is to shift it against the grid axis and string out the cells as the digits in a binary number. Every possibility of cell arrangement would have a unique number.

@cheako91155 Жыл бұрын

1. An array of bool is a big integer. 2. Whatever value is smallest is the only representation of this shape that you operate on. 3. When during a rotation the value increases or would increase, the rest of that rotation can be skipped. * Because you would never operate on a larger version of the shape. 4. After a rotation that shape's value must be smaller, then the previous version can be forgotten. 5. You don't need to use smallest value if some other scoring function would work better.

@cheako91155 Жыл бұрын

I think if you did this you can even prevent yourself from creating "large" rotations in the first place, can't you slice the work area along a diagonal plane? That is a true/1 in half of the locations(in a cube of cubes) would seemingly to me have a smaller rotation.

@dano6187 Жыл бұрын

Several comments suggest using the hash to define the canonical rotation for each shape. However, that requires producing all the rotations and rotating the shape is perhaps the most expensive operation here. Instead, we could define the canonical rotation in a way that makes it easy to decide if adding a cube would produce a shape that is not in the canonical form and if there exists a different path to reach that shape without requiring a rotation. For most shapes there will be leaf cubes that have only a single connection. Each of these leaves define alternate paths to reach the shape. If any of these alternate paths was preferred over the current path we could ignore the current path with the assurance that the preferred path will eventually be discovered. Leafs cubes connect to branches and branches eventually connect to core cubes that form loops. The set of leaf cubes plus core cubes not directly connecting branches or leaves represents every possible path to reach the current shape from a shape with 1 less cube. If we can identify a single best possible path to reach this shape then it would not be necessary to save this shape for future tests for duplicates since only the addition of the cube from the best possible path would be counted and further expanded. Only the shapes that have multiple equivalent best possible paths would need to be tested for duplication. If is is possible to always define a unique best possible path we could eliminate all duplicate testing and therefore the memory limitation on the maximum size. This would also facilitate parallelization. However, that is a big “if”. Rather than iterative deepening where all shapes of a size are saved before searching for shapes of the next larger size, it may be better to generate the smaller shapes on the fly. A benefit in addition to saving storage for the generated shapes is that complex data defining the shapes could be built iteratively as each cube is added.

@zxuiji Жыл бұрын

7:24, paused here for a moment, to me the math seems a bit simple, I don't remember what it was but I remember there being some sort of formula for generating the number of unique sequences that can get given N numbers - so for example 2 can generate 2, XY & YX, 3 can generate 6, XYZ, XZY, YXZ, YZX, ZXY & ZYX - I would use that on the result of width * height * depth, at worst would be only a few % out off from the actual number. **Edit:** btw the rotations could be dealt with rather simply, find the longest length of the shape and set that as the central axis, next you set the shortest radius of the remaining as another axis along that axis (for shapes like L where it needs to be flipped one way just use the longest hypotenuse between the shortest radius and the each half of the central axis lengths to determine which way is "up"), then finally you should have 1 orientation to check against. As for the check of "had this before?" just have multiple arrays, one for each cube count, you can use that count to quickly get to just the shapes of the same cube count (which should be a small number) and check against those. For optimising the array allocation you can use the formula that I mentioned above to pre-allocate enough slots for the majority and just generate shapes from cube count, incrementing until you hit width * height * depth, setting each new array as an offset of the original

@isbestlizard Жыл бұрын

Looks easy. Each new cube can only attach to one of 6 faces of each of the existing cubes in the shape, so long as there isn't a cube already there. The interesting bit will be pruning each new level to remove duplicates, so you'll end up with a massive tree structure, each leaf having at most 6 new leaves beneath it.

@haxi52 Жыл бұрын

Generate a cube of ints, size = max(x,y,z). Start in the middle and spiral out incrementing. These are weights. Multiply the weights by the shape and all them all up. Rotate the shape and do it again. The rotation that gives min is the one you hash. If the shape is symmetrical you may get duplicate mins, which kinda proves it works.

@Ceelvain Жыл бұрын

This video definitely has a Matt Parker vibe to it. Giving it a go and seeing what happens. The only difference is that Mike focuses on the programming side, while Matt would focus on the maths.

@user-pw5do6tu7i Жыл бұрын

Instead of generating candidates by extending the shape off every face, I think we could limit it by applying some rules. Only generate Right, Back, Down (+x, +z, -y). So a tetris S peice never checks for a tetris Z peice. And I think ontop of that you could only branch off the last added cube. The recursion will build them in a way slightly counter intuitive to us. But they should all be checked

@user-pw5do6tu7i Жыл бұрын

Instead of thinking geometrically, with a growth factor of ^n, think of it like a breadth first search.

@theirishpizzaguy6663 Жыл бұрын

he is right

@Imevul Жыл бұрын

My brain directly went "Well that's going to be annoying to represent in memory", and the remaining 15 minutes just confirmed that. I was thinking maybe you could represent the shape as a branching path through 3D space, where you would just say "from the previous cube, the next cube is 'forward' and the cube after that you turn 'up' and that will be the new forward", but I still don't think that will solve the rotation issue since it would depend on where in the shape you start pathing, and how you define "up" or "left" etc. from that point.

@KipIngram 9 ай бұрын

Ok, here is what I will try if I have a go at this. I want to shove my block of cubes down as close to the origin as I can, still having all the cubes be in the first octant. There are six faces that can be against the y/z plane - I choose based on bringing as much mass as possible close to that plane. Then there are four ways to orient around the x axis - I'll choose that to bring as much mass as possible closest to the x/y plane. Then hash it. The main rub I see here is breaking ties (when multiple orientations will put the same maximum mass close to whichever plane I'm considering). I'll have to give that some further thought.

@davidwipperfurth8465 Жыл бұрын

When you generate a shape for the first time, generate each rotation, then put all the rotations together (add them or put them all side by side using some predictive orientation rules, whatever is computationally easy and distinct) and hash the group of them. Now you have a hash that represents every rotation of that object, use that hash as the key in a dictionary. When you generate a new object, make it's rotations, hash the group of rotations and check it against the dictionary of group hashes. With this method, you do still have to compute rotations, but you only have to compute rotations once; you don't have to check each rotation against the list. Another approach would be to compute some fast characteristic stats about a shape when you generate it (like, how many boxes have 6 neighbors, 5 neighbors, etc.) and use those stats to make an elaborate sorting tree. You know your shape can't look like other shapes if they don't share similar characteristics, so by following the tree branch for "this shape has *1* box with 6 neighbors" you immediately eliminate most of the searchable space, and every branch you follow from then on further reduces the searchable space in a roughly exponential fashion. It should be easy to calculate stats like neighbor count if you are starting from a lower order shape and adding one box, as you just take the stats from that shape and make a small tweak to reflect the one box you added. You could probably even figure out how to use the stats as a hash so you aren't even walking a tree or writing tree code.

@xenoMorten Жыл бұрын

Speaking of Python and numpy, I once made an algorithm that used numpy types, and while benchmarking it I noticed it spent alot of time doing odd numpy-things, then reimplemented it using basic types and it brought around a significant increase in performance. Cool story, from the real world, I know!

@leo_carlini Жыл бұрын

Amazing ! I'm looking forward to a follow up video !

@alikaperdue Жыл бұрын

More efficient? - always grow from a placed block only into the positive quadrants. Anything that grows to the left must be a rearrangement of the object that grows to the right. - imagine the smallest box around the piece. When creating a new piece this box has sides a

@jamesalewis Жыл бұрын

My initial inclination on the optimization front is to turn the 3D array into a big int in all its possible rotations and store the lowest value, then look up that big int in the list of found shapes. Now, this would turn into a 5k-bit datum for a 17x17x17 array of bits (4913), but in reality it would be a triangular array for a 2D (n*(n+1)/2 cells), and a right tetrahedral array for the 3D case (n*(n+1)/2*(n+2)/6 possible cells), which cuts the number of cells into half and a third respectively, so that huge int for a 17-cube array can be encoded into less than 1kb (969 bits, precisely), which could work in a lookup. You may like to take the SHA-2 of the 1kb and turn it into a 256-bit datum, just counting on no collisions. That SHA-2 hash can be used as indexes to lookup the actual shape for double-checking. If you want to get it up to 32, for a nice power-of-two, your size would be just under 6k (5,984), about 5.5x fewer cells.

@Pystro Жыл бұрын

I think the fastest strategy here would be to generate all the valid shapes, but completely *eliminate the lookup* if you've seen it before. Instead you calculate the number of different ways there are to build up that shape. This turns the problem from "Find every possible shape, and count the first unique instance fully but every consecutive instance don't count." into "find every instance, and count them all in a way so that all variations of one unique shape sum up to 1". Furthermore, you'd be able to factor in the symmetry of the shape. (For example, the "L" shape in 2D that he uses as an example can appear in 8 *different* orientations, so you divide by another factor of 8, while a 2x2 square in 2D can only be in 1 unique orientation so you divide by an additional factor of 1.) The easiest way I can see to count every instance is to add cubes one at a time, because then the question "how many ways are there to build this shape" becomes "in how many orders can you add these cubes without having disconnected clusters along the way?" Then again, i just realized that if you do the "populate a grid with ones and zeroes and check if connected" strategy, you'd be answering the question of "how many shifts are there to fit this shape into the grid" which is even easier since you merely have to count the amount of zero-padding along the 3 axes and multiply those numbers. It would also be great for multi-threading because it's what's called "embarrassingly parallelizable". The problem with the byte array representation is that the number of representations for 17 cubes would be a 17*17*17 byte array, i.e. 2^(17^3)=10^1479 candidate shapes, of which only a tiny fraction are actually valid. It might be smart to treat the trivial 1*1*N and the still rather simple 1*2*(N-1) cases separately, so that your representations can shrink to (N-2)^3. Or impose an ordering on the dimensions to make it N*(N/2)*(N/3); Or probably even both for (N-2)*(N/2)*(N/3)=10^217. Or don't place your 1's into a cube shape, but an N^3 octahedral space (10^291 candidate shapes). The best I can think of is to use an octahedron truncated to (N-2)*(N/2)*(N/3) in order to combine the space that is actually reachable plus the axis ordering.

@sdgvlkjnasdlfkjawlk Жыл бұрын

Because the number of polycubes explodes to ~60 Billion at only n=16 improving the lookup when checking for duplicates seems more important than any other operation. - My primary idea is rather than creating a new polycube then checking whether it has been seen before we "look backwards." The new algorithm would be to create a new polycube of size n, then generate all polycubes of size n-1 that can be created by removing one block, and remove those from the queue. I don't know how well this could be multi-threaded though. - There are about half as many polycubes if you combine reflections. Making use of this fact would shrink the lookup time and memory requirements at the cost of more work to generate the storage representation of each cube. - Many others have already said it: The best rotation (and reflection) canonical representation I can think of is to test all of them and choose the bit string with the lowest value. This would allow you to create the full hash as n, the bounding box dimensions, and then the bit string (with leading zeroes deleted to save space) plus a count of unique reflections.

@RodrigoCespedesDaza Жыл бұрын

I think the key here might be creating a path finding algorithm where you touch all the cubes in an arbitrary shape in such a way that the entry point to the object is deterministic. So if you have a set of rules (like an automata) that describes how to traverse every cube in the object, you could traverse it from every starting point and have a set of unit vectors describing each step in the traversal; next step would be a sorting algorithm for each set of vectors based on rules that are deterministic (like, (1) you always go to the right if you can, (2) you only turn around if there's no unexplored cube available from the point you're standing, etc) If you can sort the sets of vectors deterministically, then you can determine an entry point to the object deterministically, regardless of the object's orientation. If you have an entry point, then all you need to do is hash the set of vectors that traverses the object from that starting point. Given this approach, the key is that your traversal algorithm must generate all possible sets of vectors for traversing the object in fewer iterations than the number of ways to rotate the object 😓. It feels like generating a set of rules similar to traversing a tree data structure will yield the set of vectors quickest this approach might have a best case scenario Ologn time complexity to deterministically generate a hash So looking up a deterministically generated hash might get you to a faster algorithm, I think this approach would reach a local maxima once N === "the number of possible rotations of an object", as long as N is lower than that number, this has possibility of being more performant

@happygimp0 Жыл бұрын

If i understand you correctly. Create a small list for every rotation of a object (moved so that at least one cube has the x coordinate of zero, at least one cube has the y coordinate on 0 and one cube has the z coordinate to 0. Maybe this is a single cube that has more than a single coordinate at 0). And then use the maximum or minimum value of this list and only use this rotation for comparison and storing? It would get even better when you could only build every structure once and don't need to check if you have it already created before.

@GuentherJongeling-hu6oe Жыл бұрын

A quite obvious idea to make it faster (maybe) would be to store all rotations of the cube thingies as individual items of the hash table, that way you don't have to compare all rotations of new cube combinations to check whether the combination exists somewhere. This is of course at the expense of memory.

@letsgocamping88 Жыл бұрын

But then how do you order the rotations within the table to get the same hash?

@MNbenMN Жыл бұрын

The memory expense could not be much more understated. I think the memory for hashtable for the 16-part polycube shapes would end up larger than 60TB with that approach. That's a lot of RAM!

@BooBaddyBig Жыл бұрын

To check for uniqueness, you only have to check the lowest (or highest) rotated hash number for any given shape, so storing them all in the hash table is pointless. Because they're all equivalent.

@itellyouforfree7238 Жыл бұрын

This is EXACTLY what the interviewer suggested in his question. Did you not watch the video?

@matsv201 Жыл бұрын

You don't even need to do it. Just make sure that the COG is in one direction, and you know you have the only possible rotation. Not even a need to do the search.

@AuratticStride Жыл бұрын

One simple way to manage the hashing is when you add the object to the list the first time, hash the object in all possible orientations. Then, when you're looking for the object you only need to hash once, as if it exists it should be in one of the 24 previously found orientations

@wbfaulk Жыл бұрын

As suggested by Sean in the video at 8:34, to which Mike commented that you're trading compute time for memory consumption.

@erikstv7802 Жыл бұрын

Or even just keep track of e.g. the minimum of those 24 hashes: Then your hash table won't be any larger. You could perhaps do a lookup already after having done a few of the rotations to stop that loop early (depending on how slow the rotations are versus the lookups). But since there's logics and non-numpy loops here, I expect this exercise to perform better with numba than with numpy (while working within the Python ecosystem).

@theOtherNism Жыл бұрын

I had a similar idea, but it would be more efficient to pick, from the 24 byte sequences that represent the rotations, the one that has the smallest value when represented as a number. Then store the hash of that. (although I'm not sure if storing the hash instead of the original byte sequence has any advantage.)

@royvanrijn Жыл бұрын

Wrote some code to calculate the shapes and my runtimes are (in Java). I'm using a canonical way to store the "smallest" of each of the 24 rotations. This "smallest" is just arbitrarily, as long as it's consistent. I've found that just using vectors (0,0,1),(0,1,2) etc was easier to work with and faster than using bitmaps. Especially when combined with a kind of "snake" 3D index number from a space-filling curve. In my numbering system all numbers are positive (the shape is placed in the (0,0,0) origin), so that is index 0, than (0,0,1)=1, (0,1,0)=2, (1,0,0)=3, (0,0,2)=4 etc. This way storing the entire shape is a series of very small numbers, unique and compressable. This is the current runtime/speed (the counted shapes seem to match the expected values): Written file shapes-2: 1 Last run took: 12 ms Written file shapes-3: 2 Last run took: 3 ms Written file shapes-4: 8 Last run took: 2 ms Written file shapes-5: 29 Last run took: 3 ms Written file shapes-6: 166 Last run took: 7 ms Written file shapes-7: 1023 Last run took: 25 ms Written file shapes-8: 6922 Last run took: 108 ms Written file shapes-9: 48311 Last run took: 544 ms Written file shapes-10: 346543 Last run took: 3828 ms Written file shapes-11: 2522522 Last run took: 30524 ms Written file shapes-12: 18598427 Last run took: 255812 ms Written file shapes-13: 138462649 Last run took: 2406329 ms Written file shapes-14: 1039496297 Last run took: 20646680 ms (stopped it here for now, it is saving to disk, could continue indefinitely)

@LearnedSome Жыл бұрын

The problem described in this video is a 3D modeling problem. Instead of numpy, you should create a boundary representation for each shape. And then, if you created a complete shape descriptor, you could discard duplicates by making it a unique index in your target database table.

@Txoka Жыл бұрын

A way of encoding it such that the same shape rotated ends up being an equal representation is to just calculate the center of mass of the shape and select the rotation that minimizes it (or maximizes it, or moves it to whatever corner you want, it doesnt matter as long as its the same for every piece). Should be pretty straightforward. Of course first you need to match x=x y=y and z=z for the cut container of the shape. So you only have to rotate it if there is multiple ways of doing that to start with.

@DavidBerglund Жыл бұрын

I love this corner of the internet. I just shut down the TV to think about this problem for a while and read all the interesting comments here. Thanks, humans ❤

@Dudleymiddleton Жыл бұрын

Back in 1980 (I think - it was a while ago!) My mum was a primary school teacher and she brought home a bbc micro with cub screen from the school over the summer hols to learn how to use it - I found this app called "Build" ( I think!) and it created this block lattice based structures navigated by cursor keys, I got hooked on it, I was about 13 -14.

@mctuble Жыл бұрын

What I like is his way of making it interactive. Follow along if you want. Share your results. Someone's bound to have a brilliant idea.

@kaynex1039 Жыл бұрын

Mike doesn't bring up the biggest issue. There are 29 *billion* polycubes of size 16. If your algorithm stores each polycube, you won't have enough memory for n = 16.

@aterceiracoisa_cortes Жыл бұрын

You could store the number of ones in each plane for every dimension. In a 3x3 grid an I shape would look like: 111-030 (1 one in each row, and three ones in the center column, meaning the figure is centralized and verticalized). For every unique shape, each set of values (corresponding to all hyper-planes in each dimension) contains always the same values, regardless of how the figure is positioned in the grid, and so the sets for two distinctly positioned, but topologically equal shapes, would be permutations of one another. Essentially this means we can reduce the number of elements representing the shape by a factor of N where N is the length of the smallest dimension we consider. Not sure if this solution is faster, but it works and is rotationally invariant. Also, this is highly parallelizable. The comparison function is basically a counting function. I'm sure this "hashing" function produces no false negatives (i.e. no equal shapes will compare as false), and in my five minutes of tinkering I've been unable to find any false positives (i.e. distinct shapes comparing true).

@nenhard Жыл бұрын

There are false positives in dimensions n>3. I had a similar idea. Perhaps it could work if you cut and partition a shape into small ckunks each max 3x3 size.

@aterceiracoisa_cortes Жыл бұрын

@@nenhard yeah the idea is to reduce by one dimension (2D into a 1D array, 3D into 2x2 planes), I believe I did not make it clear in the original comment but that's the gist of it

@nenhard Жыл бұрын

@@aterceiracoisa_cortesMy idea was to use tomography.

@Vixikats Жыл бұрын

After being presented with the problem I immediately thought that whether the binary representation of the data could tell you about its orientation and I'm pretty sure it can, at least for 2 dimensions. For example, given the pattern for a 2-dimensional square: 1 0 1 1 1 1 1 0 0 We can produce 4 different rotations, 2 90 degree rotations and a 180 degree rotation. If we take the row by row binary representation of all of the flips, we get this: Normal: 101 111 100 90⁰ Right: 111 010 011 180⁰: 001 111 101 90⁰ Left: 001 111 101 If you take a look, you can notice something immediately apparent. Both the 180 degree and 90 degree rotations are exact binary mirrors of their normal and opposing 90 degree counterpart. They are exactly the same except the order of the bits is reversed, which makes sense. Which means we really only have 2 different binary representations for the pattern... or do we? No- we actually do. At first I was looking through some samples and thought that there was some clever bit rotations and at first it seemed like I was correct, for example, look at the example above: 101 111 100 111 010 011 We notice the pattern I was initially looking for if we just take a look at how the bits are clustered together. This actually turns out to be simply a rotation of bits of the main configuration. The 90 degree rotation operation is nothing more than a left or right bit rotation of some degree. In this case, you can get this through EITHER (Normal ROTL 5) or (Normal ROTR 4). ***HOWEVER*** It turned out that this ability to simply rotate bits was just a fluke and that it doesn't actually hold true for all configurations. So it appears as though we really do have 2 correlated, yet still independent bit strings that we can mash together in some fashion and throw into a hash function that a hash table can use. One important thing that you have to do is that you cannot have any padding bits, so no empty rows or columns before or after the configuration. This turns out to be very simple, honestly because it's very easy to get the absolute bounds of a configuration of cubes and we can simply use that to offset and normalize everything and ensure that we're only reading relevant bits. I HAVE YET to see how this works in 3 dimensions and what data I might find, but I will continue to research and test and see whether this remains true. But maybe this'll be an interesting jumping off point that someone else might be able to use while I'm working on my own things. I am assuming that given the 24 total rotations compared to the square's 4, we're going to end up with either 6 or 12 total bit configurations that can describe all possible rotational transformations of a cube.

@coaster1235 Жыл бұрын

I would suggest using arrays, but keeping track of the bounding box of the polycube in the array (as in, the minimum and maximum coordinate in each dimension that the polycube occupies). That way, rotations can be done as modified array lookups, and you can use the bounding box information in the lookup to pretend one corner of the bounding box is at (0,0,0) and in the positive quadrant/octant/etc.

@jonaskoelker Жыл бұрын

On the theory that memory bandwidth may become an issue, I would feel sorely tempted to handle at least the small cases in a packed word. For example, for n=7, the maximal volume is 3x3x3 = 27 [imagine 3 arms meeting at (0, 0, 0)]. Pack 27 bits inside a 32-bit word, '1' meaning 'there is a cube here' and '0' meaning 'no cube here'. Use two bits for the size along the smallest dimension and another two bits for the size along the second smallest dimension. For n=10 the maximal volume is 4x4x4 = 64. In all cases 10

@timschulz9563 Жыл бұрын

First question: Is there an rotation/translation invariant representation of these shapes? Second: Is there a way to build new solutions based on the answer of the first question? I asked myself: If someone gave me a box with that thing glued into it, is it possible to know the shape? Can we use mass or/and something like inertia tensor. So is there a physical way to determine the body without looking at it. Or use the center of mass as some sort of reference for an orientation in 3D space.

@elraviv Жыл бұрын

to solve the orientation problem. let's say you had a function that decide for each shape what is its' representative orientation. that way you would only need to compare and save one orientation. here is such a function: take the 24 orientations of the 3D array of 0s & 1s and treats each one of them as one long number. the representative orientation is the one that gives you the lowest number, and there should be only one. bonus fact now you can order the shapes by this value looking up a number in an ordered array is O(log(n)) true you may have to deal with a few thousands bit long numbers, but you could use run length code, as you did to store the shapes.

@PeterBeresford-hf9iz Жыл бұрын

Rather than using matrices I represented a shape as a Python list. The enclosing box of a shape is a list of layers, each layer being a list of rows, and each row is an integer where each bit is a column, set to 1 if filled by a cube or 0 if empty. E.g. the 2 shapes for N=3 are [[7]] and [[1, 3]]. Adding a cube to an N-shape simply changes the row value, if internal to the box, or adds a new layer, row or column if external. Thus adding a cube to the side of [[1, 3]] gives [[1, 3], [0, 2]]. Given a candidate shape, each of the 24 views is a new list, obtained by rotation, flip or tilt. Valid shapes have all 24 lists added to a set, and if a new candidate is in this set it is discarded immediately. The count of shapes for N=10 took 10 minutes to calculate on my laptop. Extrapolation indicates that the time for N=12 would be about 24 hours. Mike's comments on Python's built-in hashing are borne out by this speed. All is achieved by manipulating lists (which are mutable) and a list's string representation is used for the immutable entry in the set. Works a treat.

@dancingCamels Жыл бұрын

Rather than storing rotations as hashes you could store planes in a group. If the planes are the same then i believe it's the same shape, regardless of orientation. To simplify it you could then remove any duplicate planes - e.g a set of sets

@SuppaflyZSM Жыл бұрын

I feel like there is probably some way to use vector math for this that mathematicians would immediately recognize and know how to handle.

@gougz99 Жыл бұрын

Good idea! Maybe create all vectors starting from the center of each cube to the center of all the other cubes, then sum them or something.... and multiply by matrices for rotations

@JoQeZzZ Жыл бұрын

I don't think so. It doesn't feel like an affine problem for me, since there can be holes like a u shape. You could mangle it into a vector problem by sort of serialising the 3D space, but that's just a transformation you can do after you've found a solution. The only smart thing you can probably do with vectors is an easy way to rotate and/or mirror found solutions through matrix multiplication, but that's only syntactically clever and wouldn't speed up the process computationally

@freee8838 Жыл бұрын

Gpus...

@samanthaqiu3416 Жыл бұрын

lattice math is inherently hard, otherwise it wouldn't be a credible cryptographic scheme

@sanidhyas3s Жыл бұрын

If we are just identifying shapes, where their position in the entire grid is insignificant, we can just take the rectangle/cuboid outside which there isn't any 1 so something which covers all 1s but nothing extra, now flattening that into a 1-D array, we get a binary number which we can use as the hash.

@sanidhyas3s Жыл бұрын

Upon more thought I realise that this doesn't have any sense where to turn in the flattened number, so maybe a different hash of the cropped cuboid can be more useful.

@jounik Жыл бұрын

One easy way to account for rotations would be to consider them an equivalence class and provide a canonization. I'm thinking something like having a well-defined comparator operation for the RLL encoded pattern, calculating the pattern for all the 24 rotations and only storing and hashing the smallest one.

@daveduncan8090 Жыл бұрын

Why worry about repeated shapes first? Calculate the number of times each shape of N▪︎ can fit in the cube after finding it. If there's a pattern there, you can use that to reduce the total to uniqueness and just first determine the maximum number of 16,17,18 piece shapes can fit.

@mjlavall Жыл бұрын

Update your encoding method to include all rotations of the shape in a well ordered way, so as soon as you add a shape to your hash set, the hash is calculated from all possible rotations of the shape. The next time you encounter a shape, no matter the orientation, its encoding will match the one already in the list. You will have to calculate all rotations for each shape, but only have to do one lookup.

@Akronymus_ Жыл бұрын

Has been mentioned in the video. And wouldn't really work, as it explodes the memory footprint and ruin the cache.

@matsv201 Жыл бұрын

It's way simpler to just have all objects in the same orientations and that is possible for most objects with a xyz and a COG xyz order. This way most objects can be discarded even prior to search.

@1337GameDev Жыл бұрын

7:20 - Well... maybe you can normalize, and force every query shape to be rotated to have a specific kind of symmetry / orientation, and then you ONLY need to check for ONE rotation for checking for repeated shapes. eg: use a gravity principle, and force the rotation to calculate center of mass and mass distribution of the 1s, and then rotate so the "heavier" section is on the bottom and do the same for each axis (to eliminate mirror shapes).

@tdug1991 Жыл бұрын

My first thought would be to have each square/cube represent a power of two, and the hash value of each shape would be an integer, up to n^3 bits. You'd store/lookup the minimum [or maximum] of each of the 24 rotations. You still need to do all rotations on each shape, but you only need to compare them to a hash table that you could binary search, if constructed as an ordered list. I think this only saves a CPU/memory factor of 24. The hashes would also be unique per-generation, so you'd have to use the max grid size for the smaller shapes, or at the very least, recompute them for each generation on the new grid. For n=16, this would be 512 bytes per hash, if stored with perfect efficiency. There may be a more efficient way to determine a "primary" rotation of a set of cubes, or perhaps a way to generate fewer duplicates in the first place.