"Unison: a new distributed programming language" by Paul Chiusano

Views: 42,524

Strange Loop Conference

1 day ago

Comments: 68

@KevinONeillPlus 5 years ago

The most interesting thing I've seen in some time. I'm looking forward to watching this evolve.

@hellwolf4liberty 1 year ago

If you think about it, many smart contract platforms are also content-addressed code, e.g. EVM. The difference is the data/code availability and immutability is backed by a consensus algorithm.

@mpcref 5 years ago

I've always wondered why there are no languages that work like this! It makes so much more sense on so many levels. Especially when combined with IPFS.

@xDeltaF1x 5 years ago

I think this misses an important part of the dependency problem, which is that system D can't write any code that passes Employee objects between B-methods and C-methods, since they're still different types. The benefit of this hashing approach could also be achieved (albeit less gracefully) by namespacing, couldn't it? Also, in the serialization example, you still need to write code to interop between V1 and V2 of your list; it just means that you have API calls labelled by hash instead of version number.

@aDifferentJT 3 years ago

That is mentioned in the talk. It's actually not the annoying part of the dependency problem, though; the annoying part is when you get conflicts despite not needing the interop.

@frankshank4513 5 years ago

Is there a typo in the distributed merge sort (35:05)? Should lines 5 and 6 be:

resL = at spawn '(dsort lte left)
resR = at spawn '(dsort lte right)

rather than

resL = at spawn '(dsort lte left)
resR = at spawn '(dsort lte left)

or have I completely misunderstood the parallel recursive call magic?

@PaulChiusano 5 years ago

Yes, that's a typo. Sorry about that! Will fix in published slides.

@PaulChiusano 5 years ago

Here's a link to the slides (with that fix): slides.com/pchiusano/unison-strange-loop-2019#/
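To make the typo concrete, here is a minimal Python sketch of the same shape of algorithm. It is not Unison and not the code from the slides: a thread pool stands in for `at spawn`, and the names `merge`, `merge_sort`, and `dsort` are assumptions chosen for illustration. The point is only that the second spawned task must receive the right half; passing the left half twice would silently drop half of the input.

```python
from concurrent.futures import ThreadPoolExecutor

def merge(lte, left, right):
    """Merge two sorted lists using the comparison function lte."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if lte(left[i], right[j]):
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]

def merge_sort(lte, xs):
    """Plain sequential merge sort, run inside each spawned task."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    return merge(lte, merge_sort(lte, xs[:mid]), merge_sort(lte, xs[mid:]))

def dsort(lte, xs, pool):
    """Sort the two halves as separate tasks, then merge the results."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = xs[:mid], xs[mid:]
    res_l = pool.submit(merge_sort, lte, left)
    res_r = pool.submit(merge_sort, lte, right)  # right half, not left again
    return merge(lte, res_l.result(), res_r.result())

with ThreadPoolExecutor(max_workers=2) as pool:
    result = dsort(lambda a, b: a <= b, [5, 2, 9, 1, 5, 6], pool)
```

Only the top-level split is parallel here, which avoids tasks blocking on subtasks inside a bounded pool; the version in the talk recurses through `dsort` on remote nodes instead.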

@LoganDark4357 3 years ago

As the talk goes on it starts raising more and more questions that just aren't answered. For example, during the talk about how serialization works, I feel like it's just a fucking nightmare that you have no idea what type you could get out of a file. It could be your Employee type from today or your Employee type from five years ago, or it could be a completely unrelated type. You either have to declare up front every single one of your Employee types, or deal with the possibility of the deserialized result not being one. Also, how are errors handled? I/O and side effects in general? A file made five years ago should not be stuck in the past forever. This language looks like it's successful regardless which means I'm probably missing something, I would love to know how (if) this is solved.

@asdfghyter 2 years ago

There are plenty of tools for updating old references to use the new versions. I don't know exactly how the storage system works, but I assume you would write migrations similarly to how you do with traditional databases
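The migration idea can be sketched in Python under some loud assumptions: the type hashes below are placeholder strings, the JSON envelope is invented for illustration, and `MIGRATIONS`, `serialize`, and `load_as` are hypothetical names, not anything from Unison's actual storage layer. The sketch only shows that tagging a payload with its type hash lets a reader either get the type it asked for, upgrade an older value step by step, or fail loudly.

```python
import json

# Placeholder type hashes; a real system would derive them from the type definitions.
EMPLOYEE_V1 = "hash-of-employee-v1"
EMPLOYEE_V2 = "hash-of-employee-v2"

def migrate_v1_to_v2(record):
    """Hypothetical migration: split a single name field into two fields."""
    first, _, last = record["name"].partition(" ")
    return {"first_name": first, "last_name": last}

# Maps an old type hash to (newer type hash, function that upgrades a value).
MIGRATIONS = {EMPLOYEE_V1: (EMPLOYEE_V2, migrate_v1_to_v2)}

def serialize(type_hash, value):
    """Store the value together with the hash of its type."""
    return json.dumps({"type": type_hash, "value": value})

def load_as(expected_hash, blob):
    """Decode a blob, applying migrations until it has the expected type."""
    envelope = json.loads(blob)
    type_hash, value = envelope["type"], envelope["value"]
    while type_hash != expected_hash:
        if type_hash not in MIGRATIONS:
            raise TypeError(f"cannot convert {type_hash} to {expected_hash}")
        type_hash, migrate = MIGRATIONS[type_hash]
        value = migrate(value)
    return value

old_blob = serialize(EMPLOYEE_V1, {"name": "Ada Lovelace"})
employee = load_as(EMPLOYEE_V2, old_blob)
```

A blob with an unrelated type hash raises `TypeError` instead of being misread, which is the "deal with the possibility of the deserialized result not being one" case from the question above.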

@thatgotofinal 5 years ago

And what if some function had an implementation error and I need to change how it works? The hash will change, so it will be a new function, and all the code will still be broken after publishing a new version of the library?

@ryantomczik4916 5 years ago

Everything that depended on that function would be "updated" as well. I put "updated" in quotes because it's actually creating new functions. The same goes for the entry point into the program, so at that point you would be running a different program in which everything points to the new version of your function.

@PaulChiusano 5 years ago

This was a very common question afterwards, and I wish I'd had a slide on it. Here's a longer reply about how it's done: twitter.com/unisonweb/status/1173942969726054401

@jeremykothe2847 4 years ago

I admit I'm new to this, but doesn't that mean you're swapping the 'codebase-wide analysis' problem with a 'codebase-wide-hash-propagation' one?

@ajuc005 3 years ago

@@ryantomczik4916 basically if you change how a common function works you have to change 99% of your code :) It's automatic so that's nice but still you have a lot of work for little benefit.
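The propagation discussed in this thread can be illustrated with a small Python sketch. It assumes, for illustration only, that a definition's hash covers its body text plus the hashes of the definitions it refers to; `content_hash` and the toy bodies are hypothetical, not Unison's real hashing scheme.

```python
import hashlib
from typing import Sequence

def content_hash(body: str, dep_hashes: Sequence[str]) -> str:
    """Hash a definition's body together with the hashes of its dependencies."""
    h = hashlib.sha3_512()
    h.update(body.encode("utf-8"))
    for dep in dep_hashes:
        h.update(bytes.fromhex(dep))
    return h.hexdigest()

# Version 1: `helper` has a bug, and `caller` depends on it.
helper_v1 = content_hash("x -> x + 2", [])
caller_v1 = content_hash("y -> dep0(y) * 10", [helper_v1])

# Fixing `helper` yields a new hash, so `caller` gets a new hash as well,
# even though the text of `caller` did not change.
helper_v2 = content_hash("x -> x + 1", [])
caller_v2 = content_hash("y -> dep0(y) * 10", [helper_v2])

# A definition that does not depend on `helper` keeps its hash.
unrelated = content_hash("z -> z", [])
```

Both versions of `caller` exist side by side after the fix; only transitive dependents of the changed definition are rehashed, and everything else is untouched.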

@Rallion1 9 months ago

Dependency conflicts would be preferable to me over having two different versions of the same dependency floating around. A conflict gives a clear and hard error, preventing me from adding code which might not be compatible with my existing code. If I'm operating on a type of record called an employee, and then suddenly a function I call returns something that I had expected to be that type but was actually something else, that could cause unexpected results. A dependency conflict prevents that scenario from ever happening.

@vittoriuz 1 year ago

THIS is the future

@blenderpanzi 4 years ago

Allowing different versions of the same library might solve some problems, but it also creates others if the library has global state. But I guess since this is a functional language there is no global state? Then there is no problem. If you wonder what such problems might be: in a really early version of Node.js I wrote a tiny web server, and since the version of the mime library I could access was different from the one the web server library could access, I could not feed the web server library some MIME types that it didn't cover built-in. It was just impossible without hacks or forking libraries.

@saltrocklamp199 3 years ago

Or restrict "global" state to be "module-global". Or just don't have global state like you said :)

@videojeroki 5 years ago

so you could basically delete the *.u files? how is the code documentation managed? no build, no compilation, no binaries?

@blenderpanzi 4 years ago

And what will be put in git? Some binary AST file you never wrote? Does it require you to use its own system for version control, so you can't integrate it into a project that uses other languages?

@skepticmoderate5790 4 years ago

No build doesn't mean no compilation. There would still be compilation.

@DurgaswaroopPerla 3 years ago

How is scope maintained? Same variable can exist in various scopes, can't it?

@perfectionbox 5 years ago

renaming may not break code, but it will break developer training, because now they must learn a function's new name when writing new calling code. Also, how well is this going to work with source repositories, e.g. merge/diff?

@bryku 3 years ago

We will have this problem with any language, but from what I understand both names should still work, they just point back to the same thing.

@julkiewitz 5 years ago

Sounds like it would be a huge security issue to have a system keep all those cached previous implementations around. What if one of the previous implementations contains some sensitive information?

@julkiewitz 5 years ago

In general, it seems like this would allow a whole host of similar attacks like the ones used to exploit caches in processors (Spectre, Meltdown, etc.)

@RalfVogler 4 years ago

You have the same problem with git now. Need to rewrite history then. Or just don't commit sensitive information (there are also tools for token scanning).

@anug14 5 years ago

Other languages could also implement this, right?

@ryantomczik4916 5 years ago

twitter.com/jaredforsyth/status/1173763944323764225

@bibliusz777 4 years ago

this is something really similar to what i've tried to implement for years!

@Voltra_ 3 years ago

All caches evolve with the code right? Otherwise regressions won't be discovered by unit tests

@louisthibault555 2 years ago

Is there any mechanism for garbage-collecting unused function/type definition files? How does that work?

@csebastian3 5 years ago

Really great ideas.

@Rallion1 9 months ago

Based on the demo, you don't actually keep source code. To see the source code at any point you effectively have to decompile it... I can't imagine what that workflow would be like; it doesn't *sound* like it would be particularly enjoyable.

@thatgotofinal 5 years ago

And serialization will include a hash from this language? This is basically what everyone is always trying to avoid: serializing data using built-in serializers. We want data to be agnostic to the language used, so it can later be read by a different app in a different language.

@llothar68 5 years ago

You misuse serialization as external exportation.

@jeremykothe2847 4 years ago

@@llothar68 exporting from one system, importing to another. With the option one day to upgrade/change out one of those systems for a newer one. Misusing?

@aleksandrpetrosyan1140 5 years ago

It has an ML-like syntax. I wonder how complete the language is?

@PaulChiusano 5 years ago

The current syntax is inspired by Haskell, but because we store the code as its AST, it would be very doable to have multiple syntaxes for it. See this note www.unisonweb.org/docs/language-reference#a-note-on-syntax and we are tracking "pluggable syntax" proposal here: github.com/unisonweb/unison/issues/499

@starrychloe 5 years ago

OMG this is like IPFS for functions. Not sure how useful.

@saltrocklamp199 3 years ago

Imagine using IPFS for package distribution!

@n8style 5 years ago

Have to admit when he first mentioned storing code as hashes I thought WTF would you do that?! But then after 5 minutes of hearing how it's basically treating code the same way a functional language treats data, I'm sold!

@berndeckenfels 5 years ago

Deserialisation attacks become easier if you can inject your type's hash (I guess even if pure is enforced?) 29:20

@n8style 5 years ago

Open Sourced, Statically Typed, Functional.....God I love it already take my money!

@dimitargeorgiev188 5 years ago

Great talk! How does `force` get away with having type `Future a -> a`?

@dimitargeorgiev188 5 years ago

I'm guessing due to abilities Future a -> a does not mean what I think it means, I'll try to read up the docs

@skepticmoderate5790 5 years ago

It's a prefix function that takes a future and returns the value that it receives. Seems pretty straightforward to me.

@Ondra011 4 years ago

@@skepticmoderate5790 I believe the question wasn't "what does it do?" It was "how does it accomplish that?"

@superoven 3 years ago

I assume it uses some kind of monad. It's encapsulating an impure function action in a context that allows you to make functional guarantees about how it runs. At least that's how I understand the concept

@starrychloe 5 years ago

How does it compare to Elixir?

@JanVladimirMostert 5 years ago

which hashing function will be used? and i'm guessing if you ever get a duplicate hash, it'll cause havoc?

@JanusTroelsen 5 years ago

since there is no mention of security, i am pretty sure all nodes have to trust each other. so yeah, data is trusted to be correct, and since hashing is used to identify functions, if a collision is found, it would break. but the other node can just lie about the result. if they didn't care about collisions, they wouldn't need a hash-function, they could use checksums. though it is unclear if they talk about cryptographically secure hash functions or not, though they are so fast now that i guess it is what is meant.

@bert88sta 5 years ago

512-bit sha3 hash, so probably not a huge concern for collisions. Also I'm sure they thought of something to deal with potential collisions

@DanLien 5 years ago

Paul mentioned 512-bit SHA-3 hashes around 3:45

@agsystems8220 5 years ago

A hash collision might cause havoc, depending on how good the type checking is and whether the collision is also a signature collision. The protocol also has tests as an integral part, so errors should be catchable and traceable. You can run tests descending the execution tree until you find the wrong component. The protocol could even be set up to add some sugar to one of the colliders automatically and rebuild where required. There is very little reason not to use a cryptographic hash for this. The amount of data it is required to operate on is relatively tiny, with a programmer probably outputting a few kilobytes that need to be hashed on an average day. Changing something deep in the call stack could require a significant rehash, but that should happen very rarely.

@k3ys 5 years ago

Seems like hash poisoning is a potential problem too. Could I write a function which has an unused tweaked array of values in it which causes a collision on purpose and inject my malicious code?
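For a rough sense of scale on accidental collisions with the 512-bit figure quoted in this thread, here is a small Python calculation. It assumes hashes behave like uniformly random 512-bit values and uses the standard union bound n(n-1)/2 divided by 2^bits; the function name and the choice of 2^64 definitions are illustrative assumptions. It says nothing about deliberate collisions, which depend on the collision resistance of the hash function itself.

```python
from fractions import Fraction

def birthday_upper_bound(num_items: int, hash_bits: int) -> Fraction:
    """Upper bound on the chance that any two of num_items random hashes collide."""
    pairs = num_items * (num_items - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)

# Even with 2**64 distinct definitions, the bound stays below 2**-385.
bound = birthday_upper_bound(2 ** 64, 512)
```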

@pfeilspitze 5 years ago

Why does this need a new language? The "I compiled it and stored the compiled form based on the hash" is something you can do in any language...

@ryantomczik4916 5 years ago

twitter.com/jaredforsyth/status/1173763944323764225

@agsystems8220 5 years ago

Hash naming is the innovation that gives the language its strength, but it is not the only feature. The language has other features that play to that strength. Being able to easily differentiate between lazy and eager code is important. A language that is clear about the dependencies of every line of code is required for easy implementation. The test cases are important for the protocol, so they need to be part of the language. Immutability and a functional approach to programming are almost required to make this work well (you can't easily share a mutable variable), so a language that supports that is important. A type system that can handle distribution well is required, and it needs to be able to treat code as a type. He is not just presenting the innovation, he is presenting the language. Whether it does its job better than a more established language with hash naming tacked on is up for debate. It may be the case that the language's only claim to fame is the introduction of hash naming. Their starting point was not producing an innovation, though; it was writing a language tailored to solve a problem. There is no point abandoning the language because they found a technique applicable elsewhere. The language might still prove to be the best way to approach the problem.

@dandymcgee 4 years ago

- Doesn't replacing every symbol increase the size of the codebase by 50-60 times (assuming ~10 letter variable/method names)? I assume you're at the very least doing some dictionary compression on the SHAs, but that dictionary is still a very large amount of overhead.
- What happens when you refactor something to the same name as something else? The SHA will be the same, and even if they're not, presumably a human has to read this code at some point, which means replacing SHAs with symbol information, at which point there may be a name collision and the code will no longer compile?

@mechadense 4 years ago

Multiple occurrences of the same hash should allow for decent space optimizations. This is not possible with what we are still moving to (as of 2020): as many Docker containers (and other such hacks) as there are installed apps, which mostly contain identical libraries over and over again. Maybe not x60, but on my system I wouldn't want to bet on it, given how much new software I try out.

@skepticmoderate5790 4 years ago

The hash isn't computed based on the name. It's computed based on the AST, so renaming DOES NOT change the hash.
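A toy Python sketch can show why hashing the AST makes renaming harmless. Everything here is an assumption made for illustration: the tuple-based term format, the `normalize` and `term_hash` helpers, and the positional-index scheme for bound variables are not Unison's actual representation. Variable names are replaced by positions before hashing, and the human-readable name of a definition lives in a separate table that maps names to hashes.

```python
import hashlib

def normalize(term, env=()):
    """Replace bound variable names with positional indices before hashing."""
    kind = term[0]
    if kind == "var":    # ("var", name)
        return ("var", env.index(term[1]))
    if kind == "lam":    # ("lam", param_name, body)
        return ("lam", normalize(term[2], (term[1],) + env))
    if kind == "app":    # ("app", function, argument)
        return ("app", normalize(term[1], env), normalize(term[2], env))
    if kind in ("lit", "ref"):  # literal value, or reference to another hash
        return term
    raise ValueError(f"unknown term kind: {kind}")

def term_hash(term):
    """Hash the name-free form of a closed term."""
    return hashlib.sha3_512(repr(normalize(term)).encode("utf-8")).hexdigest()

# \x -> \y -> x  and  \a -> \b -> a  differ only in variable names.
const_xy = ("lam", "x", ("lam", "y", ("var", "x")))
const_ab = ("lam", "a", ("lam", "b", ("var", "a")))
# \x -> \y -> y  has a different structure.
second = ("lam", "x", ("lam", "y", ("var", "y")))

# Definition names live outside the hashed content.
names = {"const": term_hash(const_xy)}
names["always"] = names.pop("const")  # renaming only edits this table
```

Callers store the hash rather than the name, so after the rename they still resolve to the same definition; a pretty-printer would simply show `always` where it used to show `const`.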

@toxicore1190 5 years ago

I don't get how renaming a function doesn't break others' code, as they still have to refer to that function in code.

@sofia.eris.bauhaus 5 years ago

because functions aren't referred to by name but by hash. as i understand it, the names are just used for display and initial input. i guess confusion could arise from people using different names to refer to the same function (or vice versa), but that's more of a social problem between programmers; the code itself doesn't care.