This was so good and worth the watch, perfectly explains the concepts of wikidata editing. Thank you
@SelmaPreferParis • a month ago
Very helpful, this is exactly what I'm looking for, thank you!!
@haonanqiu4251 • 4 months ago
thank you!
@sassydesi7913 • 4 months ago
Great lecture
@AlejandroGarcia_elviejo • 5 months ago
4 years later... Thank you for this series it has helped me understand many questions I had about RDF...
@lpanades • 6 months ago
In the Coffee example, the predicate "born in" could have multiple answers. I wonder whether there is a better way to express this so that the answer (concept) is restricted. I saw a Neo4j example about basketball players with an interesting pattern: "A PLAYER" "played for" "A TEAM", where "played for" had a "contract value" attribute, and that attribute lives on the link and not on the node. I had never seen that before, and I would like to know when it is a case for putting attributes on links and when on nodes. Does some kind of normalization exist?
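A minimal sketch, in plain Python with hypothetical names, of the two modelling styles the comment describes: a property graph can hang an attribute directly on a relationship, while plain triples have to promote ("reify") the relationship to a node of its own before it can carry attributes.

```python
# Property-graph style: the attribute lives on the edge itself.
edge = {
    "from": "player:jordan",
    "type": "PLAYED_FOR",
    "to": "team:bulls",
    "properties": {"contract_value": 30_000_000},
}

# Triple style: a plain triple cannot be annotated, so the relationship
# is reified into a node ("contract:1") that the attributes attach to.
triples = {
    ("contract:1", "player", "player:jordan"),
    ("contract:1", "team", "team:bulls"),
    ("contract:1", "value", 30_000_000),
}

# A common rule of thumb (not a formal normalization): values that
# describe the relationship itself (contract value, start date) go on
# the edge; values that describe one entity (a player's birth date)
# go on the node.
```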
@thegeniusfool • 6 months ago
As soon as a language is intriguing, the folks get German or French accents.
@alibagheribardi5326 • 9 months ago
"One of the most efficient ones that I have ever encountered."
@fezkhanna6900 • 10 months ago
absolutely fantastic
@kellymoses8566 • a year ago
I'm using Neo4j to model computer networks (routers, switches, servers, etc) and the ability to get the full path is incredibly useful.
@emc3000 • a year ago
Thanks so much for all these helpful videos man.
@bayesianways4114 • a year ago
Thanks Professor, very informative lecture
@bayesianways4114 • a year ago
Thanks for the session
@shirishbendre • a year ago
….and no practical example?
@edwardmitchell6581 • a year ago
This is what I was looking for. Are they even worth learning?
@AD-ox4ng • a year ago
The practical examples come later in the course. Neo4j is one graph database discussed further on, in order to study its implementation of knowledge graphs.
@hans-juergenphilippi1977 • a year ago
Entertaining, profound, and helpful. I'm a Markus Krötzsch fan boy by now. :-)
@vrjb100 • a year ago
Regarding creating IRIs: are points 4 and 5 contradictory? Don't use http(s) unless there is a real web page at the URL, but otherwise use no scheme at all? Or skip IRIs altogether and simply use the package-naming convention from Java. At the top level there could be one namespace, so labels stay short and human-readable.
@hans-juergenphilippi1977 • a year ago
"Doubles are a different kind of beast..." 😀 Hahaha, yes, confirmed!
@hans-juergenphilippi1977 • a year ago
Regarding the language of literals: is there a way to declare some kind of default language for an RDF graph? That is, instead of writing "Hello"@en on each literal, you could just use "Hello" and English is implicitly assumed?
@hans-juergenphilippi1977 • a year ago
I especially like the coffee filter example graph! 🙂 And for the first time ever, I saw the *real* purpose of # anchor chars in IRIs explained. Even the RDF(S) specifications simply use them without explaining this background. Great!
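The point about # anchors can be illustrated with Python's standard library (the IRI below is made up for illustration): the fragment after '#' is purely client-side and is split off before any HTTP request, which is why one retrievable ontology document can define many hash IRIs at no extra cost.

```python
from urllib.parse import urldefrag

# A hash IRI splits into the document to fetch and a local anchor.
# The fragment is never sent to the server: a client resolves
# http://example.org/ontology#Coffee by fetching the document and
# looking up the anchor locally.
iri = "http://example.org/ontology#Coffee"
document, fragment = urldefrag(iri)
print(document)   # http://example.org/ontology
print(fragment)   # Coffee
```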
@dogaarmangil • a year ago
9:27 It's worth noting that with RDF-star and SPARQL-star, Wikidata's misalignment with RDF has been resolved.
@ศิวศิษย์แสงนิกุล • a year ago
Thanks!!! There are some basics I don't know yet, but I'll try to learn those in order to learn this!!!
@ben_jammin242 • a year ago
Coming from a programming background, I think pseudocode would have been better. This custom syntax is too mathematical.
@moellerseo • a year ago
Wonderful!
@andrewcbuensalida • a year ago
It would be better if you showed examples of each
@andrewcbuensalida • a year ago
You should have done a side-by-side comparison of SPARQL and Cypher for the whole video, not just the first example.
@andrewcbuensalida • a year ago
anyone else having a hard time keeping their eyes open while watching?
@kellymoses8566 • a year ago
It is dry material but I found the 3 views of property graphs to be very interesting.
@andrewcbuensalida • a year ago
It would be more clear if you gave examples
@andrewcbuensalida • a year ago
whoever invented this made it too confusing
@chuanqisun • a year ago
That's one heck of an opening for the lecture. Love it!
@chuanqisun • a year ago
Thanks for uploading this series. The KZbin subtitles show that the source language was German, so to get English subtitles it translates from German into English, with a lot of errors. Is there a way to tell KZbin that this lecture is actually in English?
@kutihijacker • a year ago
Many thanks for your great and interesting lecture!
@abogdanov • a year ago
One of the best lectures I've ever heard online!
@bartkl • 2 years ago
Thanks a lot. Great material.
@a0um • 2 years ago
Very good coverage, but I can't help thinking this could have been presented in 30 minutes or less.
@a0um • 2 years ago
As a software developer with some experience in functional programming, I would have found it intuitive if, shortly before 48:30, the "table with one row and no columns" had been represented as a set containing an empty tuple: {()}
@a0um • 2 years ago
I love the topic and the presentation seems thorough. However, as someone with a background in programming, it feels a bit long-winded, and I think that introducing the RDF prefixes earlier would have made the slides easier to read.
@jonaskoelker • 2 years ago
At 43:16 it is asserted that answering Datalog queries is P-complete, yet there are queries in P that cannot be answered. Instead of P-complete, should the slides say P-hard? That is, every problem in P can be reduced to a Datalog query, but not every problem in P can be expressed as a Datalog query?

Also, I'm guessing here: is the interpretation of "P-hard in data complexity" the following: "for every decision problem p1 in P there exists a Datalog query plus rule set (q, r1, ..., rn) and a log-space Turing machine T which outputs Datalog facts (not rules or queries), such that for every instance x of p1, rundatalog(q, r1, ..., rn, output of T(x)) returns a non-empty result set if and only if x is a yes-instance of p1"?

I guess for ExpTime-completeness the Turing machine T' outputs a mix of facts, rules and a query, and you rundatalog(T'(x)); i.e. the query and rules can be a function not just of the problem type (e.g. 3-SAT vs. 3-colorability) but also of the particular instance.
@knowledge-basedsystemstudr9413 • 2 years ago
Right, so let me answer the last two paragraphs first. Your idea of what P-hardness for data complexity (and Exp-hardness for combined complexity) means seems to be correct. The actual rules that are used to put this into practice mimic the computation of a Turing machine in Datalog. For the most part, this is not too hard (we often describe Turing machine computations using rule-like statements anyway, as in "if the machine is in state q and sees symbol a, then it changes to state p, writes a b, and moves to the left"). To make this work in Datalog, the main idea is that the relational structure (database) that gets computed should look like a "trace" of the TM run: the tape (memory) will be a chain-like structure where every element is a cell, and we will have a new tape for every moment in time (so every time point, from start to end, is in the model). The challenge is to come up with a good encoding for things like "cell 23 at time step 42". If the computation is not too long (polynomial), we can simply assume that the relevant coordinates (like "23" and "42") are given to us with the input, and we merely create pairs of them during the computation. That's the idea for the P-hardness proof.

If we want to show Exp-hardness, we need to create a lot more "cells" in some way. The idea for doing this is to use facts with long lists of parameters (e.g., if I have 10 parameters in a predicate and 2 constants, I can already represent 2^10 facts). Indeed, the number of facts one can get is exponential, but the exponent is given by the arity of the predicates. For Exp-hardness, the input should determine the exponent, so the arity of predicates must grow if we want to process larger input instances. This is why this is no longer "data complexity". Other than this complication (which turns "addresses" of cells into lists of constants), the rules that describe the actual Turing machine behaviour are basically the same as in the PTime case.
If you want a more formal explanation, the complete Datalog programs needed for these encodings can be found in the 2001 survey paper "Complexity and Expressive Power of Logic Programming" by Evgeny Dantsin and colleagues. It should be easy to find and free to access online.

Now to the first question. It is really true, and there is no error in the video here: Datalog is P-complete for data complexity, yet there are problems over databases that can be decided in PTime but not with a Datalog program. A simple example of the latter would be: "find out whether the database does not contain the fact q(a)". There is nothing contradictory in this situation. P-hardness (for data complexity) just means that we can find a "cheap" reduction from any problem in P to Datalog query answering (with a fixed rule set). The ability (or, as it is here, inability) to express all problems in a complexity class directly (without any reduction) is known as the *descriptive complexity* of a database query language. If we can express all problems in P with a query language, then we say that the language *captures P*. Datalog does not capture P, but what it captures is still difficult enough to be P-hard. If we want to capture P, we can do that by adding some more features: input negation and a successor ordering. I have some more details on these in my course on database theory (iccl.inf.tu-dresden.de/web/Database_Theory_(SS2022)/en, lecture "Datalog Complexity"). A more detailed explanation of what it means to "express" a query in a query language is given in our work "Capturing Homomorphism-Closed Decidable Queries with Existential Rules" (Camille Bourgaux et al.; it's about a different query language, but the introduction to expressivity applies to Datalog just the same).
@jonaskoelker • 2 years ago
@@knowledge-basedsystemstudr9413 > If the computation is not too long (polynomial), [...]

Right, I think I understand. I think we could do the same in a less direct way: the Cook-Levin theorem (3-SAT is NP-complete) builds a boolean circuit bounded in space and time which simulates a polynomial-time Turing machine. Instead of asking "is there an input such that <blah>" we're asking "is <blah> for this particular input", since we're working with deterministic p-time TMs. I think I can figure out how to convert logic gates to Datalog.

We could also reduce directly from the circuit value problem (which is P-complete). I guess the Datalog is something like:

canOutput(GATE, true) :- hasType(GATE, and), hasInput(GATE, X), hasInput(GATE, Y), canOutput(X, true), canOutput(Y, true).
canOutput(GATE, true) :- hasType(GATE, or), hasInput(GATE, X), canOutput(X, true).
canOutput(GATE, false) :- hasType(GATE, and), hasInput(GATE, X), canOutput(X, false).
canOutput(GATE, false) :- hasType(GATE, or), hasInput(GATE, X), hasInput(GATE, Y), canOutput(X, false), canOutput(Y, false).

[canOutput if hasType(_, not) is left as an exercise.]

... and then the reduction on "true and not true" produces:

canOutput(gensym1, true).
canOutput(gensym2, true).
hasType(gensym3, not).
hasInput(gensym3, gensym2).
hasType(gensym4, and).
hasInput(gensym4, gensym1).
hasInput(gensym4, gensym3).

... and this database can be computed in log space by walking over the input string repeatedly. Maybe I need to insist that the circuit is presented in a way that's easy (enough) to parse. Since no input value canOutput both true and false, it follows that no gate can output both true and false, and thus by induction canOutput(the circuit's output gate, true) is in the database iff the circuit value is true, so that shall be the query. [The "fixpoint of (database <- database union stuff-we-can-derive-in-one-step)" then is the wavefront of evaluated gates, terminating once all gates have been evaluated.]
Also, upon thinking about it I now understand my initial confusion: emptiness of the answer set (the criterion used in reductions) is not the same as an arbitrary predicate about the database being true. [Of course, if you can evaluate the predicate in polynomial time, there exists a family of derived Datalog programs with non-empty query responses exactly for those instances where the predicate is true, by the above reduction. But that's a very different statement.]

Thanks for taking the time to write a thorough answer. [I'll bookmark this in case I want to chase down the ExpTime-completeness proof.]
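For what it's worth, here is a small Python sketch of the naive bottom-up evaluation described in this exchange, run on the "true and not true" circuit (names shortened: g1 for gensym1, and so on). As a simplification, the true/false rule pairs are collapsed by carrying the truth value as a term, and the circuit is assumed to be well-formed (no gate derives both values).

```python
# Facts from the reduction of the circuit g4 = g1 AND (NOT g2),
# with both inputs g1 and g2 set to true.
facts = {
    ("canOutput", "g1", True), ("canOutput", "g2", True),
    ("hasType", "g3", "not"), ("hasInput", "g3", "g2"),
    ("hasType", "g4", "and"), ("hasInput", "g4", "g1"),
    ("hasInput", "g4", "g3"),
}

def step(db):
    """One round of rule application: derive gate outputs whose inputs
    are all already evaluated."""
    new = set(db)
    out = {g: v for (p, g, v) in db if p == "canOutput"}
    typ = {g: t for (p, g, t) in db if p == "hasType"}
    ins = {}
    for (p, g, x) in db:
        if p == "hasInput":
            ins.setdefault(g, []).append(x)
    for g, t in typ.items():
        vals = [out[x] for x in ins.get(g, []) if x in out]
        if len(vals) < len(ins.get(g, [])):
            continue  # not all inputs evaluated yet
        if t == "and":
            new.add(("canOutput", g, all(vals)))
        elif t == "or":
            new.add(("canOutput", g, any(vals)))
        elif t == "not":
            new.add(("canOutput", g, not vals[0]))
    return new

# Iterate to the least fixpoint (the "wavefront of evaluated gates").
db = facts
while True:
    nxt = step(db)
    if nxt == db:
        break
    db = nxt

# Query: the output gate computes true AND (NOT true) = false.
assert ("canOutput", "g4", False) in db
```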
@kellymoses8566 • a year ago
@@knowledge-basedsystemstudr9413 This might be the smartest comment on KZbin.
@tiagolubiana3898 • 2 years ago
Amazing video! Thank you so much for this series.
@Xean0hrt • 2 years ago
Awesome lecture, Prof!
@nilofarmoradifarisar3171 • 2 years ago
Great course! Thank you so much, Prof. Markus Krötzsch.
@haonanqiu4251 • 2 years ago
00:00: Introduction & Recap
2:01: Negation
5:32: Semantics of negation (1)
9:08: Semantics of negation (2)
15:04: Semantics of negation (3)
20:33: Stratified negation
25:11: Evaluating stratified rules
30:11: The perfect model
34:24: Obtaining a stratification
38:22: Outlook: Beyond stratified negation
@haonanqiu4251 • 2 years ago
00:00: Introduction & Recap
1:41: Datalog vs. SPARQL: Supported features
7:15: Datalog vs. SPARQL: Missing features
11:44: Many implementations of Datalog exist
16:04: Rules in Rulewerk
21:17: RDF and SPARQL in Datalog
24:44: Conclusion
@haonanqiu4251 • 2 years ago
00:00: Introduction
2:06: A rule-based query language
7:45: Datalog Syntax
15:55: Datalog Semantics (1)
20:50: Datalog Semantics (2)
22:54: Datalog Semantics (3)
33:04: Using Datalog on RDF
37:53: Queries beyond SPARQL
41:37: Datalog Complexity
@haonanqiu4251 • 2 years ago
00:00: Recap
1:20: More fine-grained complexity measures
2:33: Below P
6:30: Data Complexity of SPARQL (1)
8:50: Data Complexity of SPARQL (2)
11:31: Upper bounds
14:39: Answering queries in PSpace
19:47: Answering queries in NL for data
21:43: Conclusion
@haonanqiu4251 • 2 years ago
00:00: Recap
2:40: Beyond NP
8:35: Quantified Boolean Formulae
15:23: Hardness of QBF Evaluation
17:25: Universal quantifiers in SPARQL
22:31: SPARQL is PSpace-hard
28:47: SPARQL is PSpace-hard (2)
32:21: Is SPARQL practical?
35:48: More fine-grained complexity measures