Thank you Chad for your contributions to the algo trading community. My respects!
@sudhanshusingh-cy9wp 2 years ago
You are, I think, one of the better engineers in the trading world. Really learning a lot from you.
@Alexrmacleod 11 months ago
Thank you so much for making this video. Truly helpful content; you're making the world a better place.
@jackroark6928 2 years ago
⭐❤ Brilliant video!
@grizthegriffdog9939 2 years ago
Brother!! Thanks so much. Truly well done piece. Timescale for the win!
@rickybrenay6249 2 years ago
This is great. I've been searching for where to scrape everything to make a range pattern. I have a few indicators I wanted to backtest from TradingView.
@maverick456-33 2 years ago
Outstanding!
@kaustavdatta4748 8 months ago
Thanks for another awesome video. Do you also use TimescaleDB as your feature store? If not, how do you recommend efficiently storing (or generating in real time) hundreds of features or technical indicators derived from OHLCV series?
@mchilde1765 2 years ago
How do you get changes to the aggregated data from TimescaleDB in real time, like a websocket, instead of querying at intervals?
@asuras3409 10 months ago
Great content! Thank you very much. I'd just like to know how I could combine this with historical OHLC data for a backtesting database that stays up to date.
@ChadThackray 10 months ago
You can either insert the historical candles straight into the same table as the aggregated ones if they're the same timeframe, or have a separate table and query across them both.
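As a rough sketch of the second option, assuming a hypothetical ohlc_historical table for the downloaded candles alongside the ohlc_minute aggregate (both names are illustrative, and the connection details are placeholders):

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres password=password host=localhost")
cursor = conn.cursor()
# take historical rows only up to where the live aggregate begins,
# then append the live rows, so overlapping minutes aren't duplicated
cursor.execute("""
    SELECT bucket, open, high, low, close
    FROM ohlc_historical
    WHERE bucket < (SELECT min(bucket) FROM ohlc_minute)
    UNION ALL
    SELECT bucket, open, high, low, close
    FROM ohlc_minute
    ORDER BY bucket;
""")
rows = cursor.fetchall()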
@som3k571 2 years ago
Great video! Really appreciated! A question I have is: how would you deal with, e.g., a lost connection? There would be data missing. I am dealing with a websocket which only provides the latest data, but there is a REST API which would allow me to fetch the missing data. How would you ensure that the database is always up to date?
@matthewoconnell6291 1 year ago
Yep, you hit on the downside of this approach. The answer is that you would spend more and more of your time - instead of focusing on your strategies - replicating all the infrastructure to handle such scenarios that the likes of Binance give you for free.
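If you do decide to build that backfill yourself, a minimal sketch with python-binance's REST klines (insert_candles is a hypothetical helper that writes rows into your OHLC table; it's not from the video):

import datetime as dt
from binance.client import Client

API_KEY, API_SECRET = "your_key", "your_secret"  # placeholders
client = Client(API_KEY, API_SECRET)

def backfill(symbol: str, last_seen: dt.datetime) -> None:
    # pull the 1m candles missed while the websocket was down
    klines = client.get_historical_klines(
        symbol,
        Client.KLINE_INTERVAL_1MINUTE,
        start_str=int(last_seen.timestamp() * 1000),  # ms timestamp is accepted here
    )
    insert_candles(symbol, klines)  # hypothetical: upsert into the OHLC table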
@santhoshdasari7 2 years ago
Thanks for sharing…
@kannansingaravelu 1 year ago
Great video! Will the files stored in the database persist even after Docker is stopped and re-run?
@ChadThackray 1 year ago
Yes, it will be saved to the folder that you mount and persisted between restarts of the DB.
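For reference, that's just a standard Docker bind mount; something along these lines, where the host path and image tag are placeholders, not necessarily what the video used:

docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=password -v /your/host/folder:/var/lib/postgresql/data timescale/timescaledb:latest-pg14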
@kannansingaravelu 1 year ago
Thanks @ChadThackray. Also, could you suggest how we can access the database (from the container) outside of the host system, say if we want to access it from another laptop/desktop that may not be on the same network?
@ChadThackray 1 year ago
@kannansingaravelu The same way you would access any other database. The exact details depend on your networking setup.
@SarfarazAhmedAbbasi 1 year ago
How do we merge historic data with the raw trades data? For example, I create this system today and start collecting raw trade data from today, but for the historical side I want to download the minute data once and combine the two. How do I combine them?
@grizthegriffdog9939 2 years ago
Hey Chad, I got another question. My ETHBTC trade data is quite slow, and ohlc_minute_data often has 2 to 5 minute gaps. For the life of me I can't figure out where I should resample the data. I thought for sure the continuous aggregate would forward-fill the data right in the DB, but it doesn't seem to. It makes sense to simply ffill() the df once we request it, but I can't seem to get that to function properly either. Any advice?
@grizthegriffdog9939 2 years ago
Not sure if this will work or not, but in the fetchklines func I had to reset the index as a DatetimeIndex on the date column, then drop the old date column, and forward-fill the NAs with asfreq. I tried an alternative method using time_bucket_gapfill in TimescaleDB but wasn't able to get it to function correctly in my query. Just not savvy enough! Anyway, if you can think of a better way than the below, I'd sure love to hear it! All the best. In case you don't realize what I'm trying to do: if you look @56:53 in your vid, you see the time jump from 14:06 to 14:10 on the ETHBTC pair, since there was no trade data for 4 minutes. I just want to make sure, in case that happens, that when I calc indicators there is data forward-filled from the last data point for the missing minutes.

data = cursor.fetchall()
df = pd.DataFrame(data, columns=['date', 'price'])
df.index = df['date']  # use the date column as a DatetimeIndex
df.drop(['date'], axis=1, inplace=True)
df = df.asfreq('1Min').ffill()  # insert the missing minutes, then forward-fill
@ChadThackray 2 years ago
I haven't re-watched the full video to check, but I suspect that gap in the data is because I stopped the script that was streaming trade data whilst I worked on it. That script needs to be running 24/7. Are the trades being correctly inserted into your database? You can query the trades table by time descending to look at the most recent ones. If they're in the DB, the continuous aggregate should work. If they're not, it's a problem with the data gathering.
@grizthegriffdog9939 2 years ago
@ChadThackray Trades are correctly inserted. The problem is that sometimes 10 minutes go by without an update on the 'ETHBTC' pair. If there was an update every minute it would be fine and you wouldn't notice a problem. If you check that timestamp I posted you can see the jump in your data too on that pair. I think there is a solution where we use the time_bucket_gapfill function in TimescaleDB with locf() (last observation carried forward), but I couldn't figure out how to use it properly from the docs. So instead I used my hack there to resample the data with asfreq('1Min') and forward-fill the NA values, but I'm sure it can be done with Timescale more efficiently.
@grizthegriffdog9939 2 years ago
@Chad Thackray Also, the gaps are in raw_trade_data and then continue to be in ohlc_data_minute from there. Again, it has to do with gaps on low-volume pairs. Naturally, this may not be a huge issue if one was using kline streams, but it would still be nice to find a solution. Thanks again! @56:53 in the video you can see the gap in the data pulled for the ETHBTC pair.
@ChadThackray 2 years ago
@grizthegriffdog9939 Yeah, you're right. I assumed ETHBTC would be higher volume, so there wouldn't be any gaps. In that case the approach with the gapfill function is correct if you need to store every 1m bar in the DB.
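For anyone who lands here later, a rough sketch of that gapfill query, assuming a raw_trade_data table with time, symbol and price columns (the column names may differ from yours):

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres password=password host=localhost")
cursor = conn.cursor()
# bucket trades into 1m bars, emit a row even for empty minutes,
# and carry the last observed price forward with locf()
cursor.execute("""
    SELECT time_bucket_gapfill('1 minute', time) AS bucket,
           locf(last(price, time)) AS close
    FROM raw_trade_data
    WHERE symbol = 'ETHBTC'
      AND time > now() - INTERVAL '1 day'
      AND time < now()
    GROUP BY bucket
    ORDER BY bucket;
""")
rows = cursor.fetchall()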
@lakshit6429 1 year ago
I've heard many times that lots of quant firms use hundreds of GBs of storage every single day. I don't understand where and how they use such a large amount of storage?
@bicker1702 2 years ago
Hi, I bought the data science course in the last couple of days and was wondering if you would do some updates including models like risk metrics, etc.
@ChadThackray 2 years ago
Hey Vitor. Could you send over a few examples?
@benpowerhaus 2 years ago
The risk metrics you are referring to are made with machine learning. They are pretty complex; I doubt they can be simply covered in a course. But who knows, maybe Chad is able to figure it out.
@jessojoy8899 2 years ago
Does using Celery for streaming provide any added advantage?
@ChadThackray 2 years ago
Depends on how much data you're ingesting. If you get big spikes of data, it can even out the workload. If you're just getting data from one exchange or a few tickers, you shouldn't need it.
@progtuts6071 2 years ago
Do you by any chance have an idea of how to take in and process tick data for all 400+ coins in real time? I can't come up with a solution for the life of me.
@ChadThackray 2 years ago
Does it fail/bottleneck when you put all the names of the coins in? If so, you could consider the async version of the socket manager, which I think should be more performant, or put Celery in front of it like someone else suggested.
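A minimal sketch of the async multiplex route with python-binance (the three symbols are placeholders; in practice you'd build the stream list from the exchange info):

import asyncio
from binance import AsyncClient, BinanceSocketManager

async def main():
    client = await AsyncClient.create()
    bm = BinanceSocketManager(client)
    # one multiplexed connection carrying many trade streams
    streams = [f"{s}@trade" for s in ("btcusdt", "ethusdt", "bnbusdt")]
    async with bm.multiplex_socket(streams) as ms:
        while True:
            msg = await ms.recv()  # {'stream': ..., 'data': {...}}
            print(msg["stream"], msg["data"].get("p"))  # 'p' is the trade price

asyncio.run(main())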
@progtuts6071 2 years ago
I will look into it, thank you
@jessojoy8899 2 years ago
Hi, even though I create the database and table in Docker, after I exit the container and go to psql I can't find the database. Any idea why the connection is failing?
@ChadThackray 2 years ago
If you want to access the database, it'll be easier to use psql inside the container. You should be able to access it from outside the container, but you'll need to make sure you enter the correct address, port, username and password as appropriate.
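For instance, if the container publishes port 5432 and you kept the defaults, something like this from the host should work (user and port are whatever you configured when starting the container):

psql -h localhost -p 5432 -U postgres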
@jessojoy8899 2 years ago
@ChadThackray Yeah, I am able to access psql within the container, and when I check, the database and tables I created are there. But when I simply run my Python script, I get a "database not found" error.
@cyberpunk2978 2 years ago
Is TimescaleDB faster than MongoDB/HDF5, etc. for financial time series data?
@ChadThackray 2 years ago
That's a contentious issue. Ultimately it depends on what you're doing. Automatic aggregations are why I use Timescale.
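For context, those are TimescaleDB continuous aggregates. A rough sketch of one over a trades table like the video's, assuming a recent TimescaleDB version and guessed column names:

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres password=password host=localhost")
conn.autocommit = True  # continuous aggregates can't be created inside a transaction
cursor = conn.cursor()
# TimescaleDB keeps these 1m OHLC buckets up to date as trades arrive,
# so you never recompute the bars by hand
cursor.execute("""
    CREATE MATERIALIZED VIEW ohlc_minute
    WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 minute', time) AS bucket,
           symbol,
           first(price, time) AS open,
           max(price) AS high,
           min(price) AS low,
           last(price, time) AS close
    FROM raw_trade_data
    GROUP BY bucket, symbol;
""")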
@rohitkumar-nk6sd 2 years ago
Facing an error: decouple.UndefinedValueError: API_SECRET not found. Declare it as envvar or define a default value. The file is named .env and placed in the root folder. Kindly help me resolve it.
@ChadThackray 2 years ago
Are you using a Jupyter notebook? If so, that sometimes causes errors with decouple. If you search for `Jupyter notebooks break decouple's default config`, there is a solution on GitHub.
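The usual workaround is to point decouple at the .env file explicitly instead of relying on its auto-discovery, roughly like this (the path is a placeholder):

from decouple import Config, RepositoryEnv

# bypass the .env search, which can fail under Jupyter's working directory
config = Config(RepositoryEnv("/full/path/to/.env"))
API_SECRET = config("API_SECRET")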