Thank you Chad for your contributions to the algo trading community. My respects!
@sudhanshusingh-cy9wp 2 years ago
You are, I think, one of the better engineers in the trading world. Really learning a lot from you.
@Alexrmacleod 11 months ago
Thank you so much for making this video. Truly helpful content; you're making the world a better place.
@jackroark6928 2 years ago
⭐❤ Brilliant video!
@grizthegriffdog9939 2 years ago
Brother!! Thanks so much. Truly well done piece. Timescale for the win!
@rickybrenay6249 2 years ago
This is great. I've been searching for where to scrape everything to make a range pattern. I have a few indicators I wanted to backtest from TradingView.
@maverick456-33 2 years ago
Outstanding!
@kaustavdatta4748 8 months ago
Thanks for another awesome video. Do you also use TimescaleDB as your feature store? If not, how do you recommend efficiently storing (or generating in real time) hundreds of features or technical indicators derived from OHLCV series?
@mchilde1765 2 years ago
How do you get changes to the aggregated data from TimescaleDB in real time, like a websocket, instead of querying at intervals?
@asuras3409 10 months ago
Great content! Thank you very much. I'd just like to know how I could combine this with historical OHLC data for a backtesting database that stays up to date.
@ChadThackray 10 months ago
You can either insert the historical candles straight into the same table as the aggregated ones if they're the same timeframe, or have a separate table and query across them both.
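As a rough sketch of the second option, assuming a hypothetical ohlc_historical table for the downloaded candles alongside the ohlc_minute aggregate (both names are illustrative, and the connection details are placeholders):

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres password=password host=localhost")
cursor = conn.cursor()
# take historical rows only up to where the live aggregate begins,
# then append the live rows, so overlapping minutes aren't duplicated
cursor.execute("""
    SELECT bucket, open, high, low, close
    FROM ohlc_historical
    WHERE bucket < (SELECT min(bucket) FROM ohlc_minute)
    UNION ALL
    SELECT bucket, open, high, low, close
    FROM ohlc_minute
    ORDER BY bucket;
""")
rows = cursor.fetchall()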
@som3k571 2 years ago
Great video! Really appreciated! A question I have is: how would you deal with, e.g., a lost connection? There would be data missing. I am dealing with a websocket which only provides the latest data, but there is a REST API which would allow me to fetch the missing data. How would you ensure that the database is always up to date?
@matthewoconnell6291 1 year ago
Yep, you hit on the downside of this approach. The answer is that you would spend more and more of your time - instead of focusing on your strategies - replicating all the infrastructure to handle such scenarios that the likes of Binance give you for free.
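If you do decide to build that backfill yourself, a minimal sketch with python-binance's REST klines (insert_candles is a hypothetical helper that writes rows into your OHLC table; it's not from the video):

import datetime as dt
from binance.client import Client

API_KEY, API_SECRET = "your_key", "your_secret"  # placeholders
client = Client(API_KEY, API_SECRET)

def backfill(symbol: str, last_seen: dt.datetime) -> None:
    # pull the 1m candles missed while the websocket was down
    klines = client.get_historical_klines(
        symbol,
        Client.KLINE_INTERVAL_1MINUTE,
        start_str=int(last_seen.timestamp() * 1000),  # ms timestamp is accepted here
    )
    insert_candles(symbol, klines)  # hypothetical: upsert into the OHLC table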
@santhoshdasari7 2 years ago
Thanks for sharing…
@kannansingaravelu 1 year ago
Great video! Will the files stored in the database persist even after Docker is stopped and re-run?
@ChadThackray 1 year ago
Yes, it will be saved to the folder that you mount and persisted between restarts of the DB.
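For reference, that's just a standard Docker bind mount; something along these lines, where the host path and image tag are placeholders, not necessarily what the video used:

docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=password -v /your/host/folder:/var/lib/postgresql/data timescale/timescaledb:latest-pg14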
@kannansingaravelu 1 year ago
Thanks @ChadThackray. Also, could you suggest how we can access the database (from the container) outside of the host system, say if we want to access it from another laptop/desktop that may not be on the same network?
@ChadThackray 1 year ago
@kannansingaravelu The same way you would access any other database. The exact details depend on your networking setup.
@SarfarazAhmedAbbasi 1 year ago
How do we merge historic data with the raw trades data? For example, I create this system today and start collecting raw trade data from today, but for the historical side I want to download the minute data once and combine the two. How do I combine them?
@grizthegriffdog9939 2 years ago
Hey Chad, I got another question. My ETHBTC trade data is quite slow, and ohlc_minute_data often has 2 to 5 minute gaps. For the life of me I can't figure out where I should resample the data. I thought for sure the continuous aggregate would forward-fill the data right in the DB, but it doesn't seem to. It makes sense to simply ffill() the df once we request it, but I can't seem to get that to function properly either. Any advice?
@grizthegriffdog9939 2 years ago
Not sure if this will work or not, but in the fetchklines func I had to reset the index as a DatetimeIndex on the date column, then drop the old date column, and forward-fill the NAs with asfreq. I tried an alternative method using time_bucket_gapfill in TimescaleDB but wasn't able to get it to function correctly in my query. Just not savvy enough! Anyway, if you can think of a better way than the below, I'd sure love to hear it! All the best. In case you don't realize what I'm trying to do: if you look @56:53 in your vid, you see the time jump from 14:06 to 14:10 on the ETHBTC pair, since there was no trade data for 4 minutes. I just want to make sure, in case that happens, that when I calc indicators there is data forward-filled from the last data point for the missing minutes.

data = cursor.fetchall()
df = pd.DataFrame(data, columns=['date', 'price'])
df.index = df['date']  # use the date column as a DatetimeIndex
df.drop(['date'], axis=1, inplace=True)
df = df.asfreq('1Min').ffill()  # insert the missing minutes, then forward-fill
@ChadThackray 2 years ago
I haven't re-watched the full video to check, but I suspect that gap in the data is because I stopped the script that was streaming trade data whilst I worked on it. That script needs to be running 24/7. Are the trades being correctly inserted into your database? You can query the trades table by time descending to look at the most recent ones. If they're in the DB, the continuous aggregate should work. If they're not, it's a problem with the data gathering.
@grizthegriffdog9939 2 years ago
@ChadThackray Trades are correctly inserted. The problem is that sometimes 10 minutes go by without an update on the 'ETHBTC' pair. If there was an update every minute it would be fine and you wouldn't notice a problem. If you check that timestamp I posted you can see the jump in your data too on that pair. I think there is a solution where we use the time_bucket_gapfill function in TimescaleDB with locf() (last observation carried forward), but I couldn't figure out how to use it properly from the docs. So instead I used my hack there to resample the data with asfreq('1Min') and forward-fill the NA values, but I'm sure it can be done with Timescale more efficiently.
@grizthegriffdog9939 2 years ago
@Chad Thackray Also, the gaps are in raw_trade_data and then continue to be in ohlc_data_minute from there. Again, it has to do with gaps on low-volume pairs. Naturally, this may not be a huge issue if one was using kline streams, but it would still be nice to find a solution. Thanks again! @56:53 in the video you can see the gap in the data pulled for the ETHBTC pair.
@ChadThackray 2 years ago
@grizthegriffdog9939 Yeah, you're right. I assumed ETHBTC would be higher volume, so there wouldn't be any gaps. In that case the approach with the gapfill function is correct if you need to store every 1m bar in the DB.
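For anyone who lands here later, a rough sketch of that gapfill query, assuming a raw_trade_data table with time, symbol and price columns (the column names may differ from yours):

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres password=password host=localhost")
cursor = conn.cursor()
# bucket trades into 1m bars, emit a row even for empty minutes,
# and carry the last observed price forward with locf()
cursor.execute("""
    SELECT time_bucket_gapfill('1 minute', time) AS bucket,
           locf(last(price, time)) AS close
    FROM raw_trade_data
    WHERE symbol = 'ETHBTC'
      AND time > now() - INTERVAL '1 day'
      AND time < now()
    GROUP BY bucket
    ORDER BY bucket;
""")
rows = cursor.fetchall()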
@lakshit6429 1 year ago
I've heard many times that lots of quant firms use hundreds of GBs of storage every single day. I don't understand where and how they use such a large amount of storage?
@bicker1702 2 years ago
Hi, I bought the data science course in the last couple of days and was wondering if you would do some updates including models like risk metrics, etc.
@ChadThackray 2 years ago
Hey Vitor. Could you send over a few examples?
@benpowerhaus 2 years ago
The risk metrics you are referring to are made with machine learning. They are pretty complex; I doubt they can be simply covered in a course. But who knows, maybe Chad is able to figure it out.
@jessojoy8899 2 years ago
Does using Celery for streaming provide any added advantage?
@ChadThackray 2 years ago
Depends on how much data you're ingesting. If you get big spikes of data, it can even out the workload. If you're just getting data from one exchange or a few tickers, you shouldn't need it.
@progtuts6071 2 years ago
Do you by any chance have an idea of how to take in and process tick data for all 400+ coins in real time? I can't come up with a solution for the life of me.
@ChadThackray 2 years ago
Does it fail/bottleneck when you put all the names of the coins in? If so, you could consider the async version of the socket manager, which I think should be more performant, or put Celery in front of it like someone else suggested.
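A minimal sketch of the async multiplex route with python-binance (the three symbols are placeholders; in practice you'd build the stream list from the exchange info):

import asyncio
from binance import AsyncClient, BinanceSocketManager

async def main():
    client = await AsyncClient.create()
    bm = BinanceSocketManager(client)
    # one multiplexed connection carrying many trade streams
    streams = [f"{s}@trade" for s in ("btcusdt", "ethusdt", "bnbusdt")]
    async with bm.multiplex_socket(streams) as ms:
        while True:
            msg = await ms.recv()  # {'stream': ..., 'data': {...}}
            print(msg["stream"], msg["data"].get("p"))  # 'p' is the trade price

asyncio.run(main())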
@progtuts6071 2 years ago
I will look into it, thank you
@jessojoy8899 2 years ago
Hi, even though I create the database and table in Docker, after I exit the container and go to psql I can't find the database. Any idea why the connection is failing?
@ChadThackray 2 years ago
If you want to access the database, it'll be easier to use psql inside the container. You should be able to access it from outside the container, but you'll need to make sure you enter the correct address, port, username and password as appropriate.
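For instance, if the container publishes port 5432 and you kept the defaults, something like this from the host should work (user and port are whatever you configured when starting the container):

psql -h localhost -p 5432 -U postgres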
@jessojoy8899 2 years ago
@ChadThackray Yeah, I am able to access psql within the container, and when I check, the database and tables I created are there. But when I simply run my Python script, I get a "database not found" error.
@cyberpunk2978 2 years ago
Is TimescaleDB faster than MongoDB/HDF5, etc. for financial time series data?
@ChadThackray 2 years ago
That's a contentious issue. Ultimately it depends on what you're doing. Automatic aggregations are why I use Timescale.
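For context, those are TimescaleDB continuous aggregates. A rough sketch of one over a trades table like the video's, assuming a recent TimescaleDB version and guessed column names:

import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres password=password host=localhost")
conn.autocommit = True  # continuous aggregates can't be created inside a transaction
cursor = conn.cursor()
# TimescaleDB keeps these 1m OHLC buckets up to date as trades arrive,
# so you never recompute the bars by hand
cursor.execute("""
    CREATE MATERIALIZED VIEW ohlc_minute
    WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 minute', time) AS bucket,
           symbol,
           first(price, time) AS open,
           max(price) AS high,
           min(price) AS low,
           last(price, time) AS close
    FROM raw_trade_data
    GROUP BY bucket, symbol;
""")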
@rohitkumar-nk6sd 2 years ago
Facing an error: decouple.UndefinedValueError: API_SECRET not found. Declare it as envvar or define a default value. The file is named .env and placed in the root folder. Kindly help me resolve it.
@ChadThackray 2 years ago
Are you using a Jupyter notebook? If so, that sometimes causes errors with decouple. If you search for `Jupyter notebooks break decouple's default config`, there is a solution on GitHub.
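The usual workaround is to point decouple at the .env file explicitly instead of relying on its auto-discovery, roughly like this (the path is a placeholder):

from decouple import Config, RepositoryEnv

# bypass the .env search, which can fail under Jupyter's working directory
config = Config(RepositoryEnv("/full/path/to/.env"))
API_SECRET = config("API_SECRET")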