digitalmars.D.announce - Yahoo Finance Scraper

Selim <sozel wpi.edu> writes:
I wrote a small Yahoo finance scraper and wanted to share with 
the community. I have been using D for a while and I think 
contributing something to the community is good. There is an 
example main script and a unit test. Those should get you going. 
It currently saves the scraped data as a json file under the 
executable's folder. I might add a public method to access 
individual data columns inside json in the following days too.

All mistakes are my own and I appreciate any feedback.

https://github.com/SelimOzel/YahooMinerD

Best,
Selim
Jun 12
jmh530 <john.michael.hall gmail.com> writes:
On Friday, 12 June 2020 at 18:22:28 UTC, Selim wrote:
 I wrote a small Yahoo finance scraper and wanted to share with 
 the community. I have been using D for a while and I think 
 contributing something to the community is good. There is an 
 example main script and a unit test. Those should get you 
 going. It currently saves the scraped data as a json file under 
 the executable's folder. I might add a public method to access 
 individual data columns inside json in the following days too.

 All mistakes are my own and I appreciate any feedback.

 https://github.com/SelimOzel/YahooMinerD

 Best,
 Selim
Thanks! There was a period there where you couldn't use the Yahoo API; glad to see that people can use it again.

I haven't run it myself yet, but I have a few comments for potential changes and enhancements.

Why do you use a class for YahooMinerD (also, the name YahooFinanceD might get more people to use it)? It doesn't look like you are using any inheritance. I don't see any reason not to change to a struct and avoid new.

It looks like you have a lot of writeln statements. While these could be helpful, they will also prevent those functions from ever being @nogc. You could use an approach like below and give the user the opportunity to avoid the writelns and allow for attribute inference elsewhere.

    @nogc void fooImpl(bool val)() if (val) { }

    void fooImpl(bool val)() if (!val)
    {
        import std.stdio : writeln;
        writeln("here");
    }

    void foo(bool val = false)()
    {
        fooImpl!val;
    }

    @nogc void main()
    {
        foo!true;
    }

It looks like the primary way to get the data is from the WriteToJson, correct? What if you want to use the data without writing the JSON to file? For instance, I want to get the data and put it in my own database, or I just want to get the data, do some calculations, and then not save it. There should be a way to get the data out of there without writing to file. std.json can be used for parsing the JSON and there are other libraries out there.

The WriteToJSON method should also allow writing events or prices without needing to write both. You might consider adding the ability to control the frequency (instead of just daily).
Jun 12
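
As a rough illustration of the std.json suggestion above, the sketch below parses a JSON string in memory and pulls out one column without touching the file system. The document shape (a "prices" array of objects with "date" and "close" fields) and the function name closes are assumptions made up for this example; they are not taken from the YahooMinerD repository.

```d
import std.json : JSONValue, parseJSON;

/// Returns the "close" column of a JSON document shaped like
/// {"prices": [{"date": "...", "close": 1.0}, ...]}.
/// That shape is hypothetical; adapt the keys to the real output.
double[] closes(string text)
{
    JSONValue root = parseJSON(text);
    double[] result;
    foreach (row; root["prices"].array)
        result ~= row["close"].get!double; // accepts integer and floating values
    return result;
}

void main()
{
    import std.algorithm : sum;
    import std.stdio : writeln;

    enum sample = `{"prices": [
        {"date": "2020-06-11", "close": 10.5},
        {"date": "2020-06-12", "close": 11.5}
    ]}`;

    auto c = closes(sample);
    writeln("mean close: ", c.sum / c.length);
}
```

The same values could then be inserted into a database or used in a calculation and discarded, which is the use case described in the post.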
Selim <sozel wpi.edu> writes:
On Friday, 12 June 2020 at 19:10:06 UTC, jmh530 wrote:

 Why do you use a class for YahooMinerD (also the name 
 YahooFinanceD might get more people to use)? It doesn't look 
 like you are using any inheritance. I don't see any reason not 
 to change to a struct and avoid new.
Thanks! I liked the new name and updated it. Yeah, I realized that about classes and converted it to a struct. I didn't like the new there either.
 It looks like you have a lot of writeln statements. While these 
 could be helpful, they will also prevent those functions from 
 ever being @nogc. You could use an approach like below and give 
 the user the opportunity to avoid the writelns and allow for 
 attribute inference elsewhere.
I implemented a framework quite similar to the one you described. I actually didn't realize how robust templates were in D until your comment. Logging is now optional. My application code and test code have examples. I don't think the MineImpl function can be @nogc at this point, but please correct me if I'm wrong.
 It looks like the primary way to get the data is from the 
 WriteToJson, correct? What if you want to use the data without 
 writing the JSON to file? For instance, I want to get the data 
 and put it in my own database, or I just want to get the data, 
 do some calculations, and then not save it. There should be a 
 way to get the data out of there without writing to file. 
 std.json can be used for parsing the JSON and there are other 
 libraries out there.
That's true. I added a data frame struct to save all corporate actions and prices. It can be accessed from the script. Should be quite easy to convert that into a csv or something else too.
 The WriteToJSON method should also allow writing events or 
 prices without needing to write both. You might consider adding 
 the ability to control the frequency (instead of just daily).
Added both of them!

S
Jun 13
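
The reply above mentions that the new data frame struct should be easy to convert to CSV. A minimal sketch of that idea follows; the PriceRow struct, its fields, and the toCsv function are hypothetical stand-ins, since the actual data frame layout is not shown in the thread.

```d
import std.array : appender;
import std.format : format;

/// Hypothetical row type; the real data frame in the project may differ.
struct PriceRow
{
    string date;
    double open;
    double close;
}

/// Serializes rows to CSV text with a header line and two decimals per price.
string toCsv(const PriceRow[] rows)
{
    auto buf = appender!string();
    buf.put("date,open,close\n");
    foreach (r; rows)
        buf.put(format("%s,%.2f,%.2f\n", r.date, r.open, r.close));
    return buf.data;
}

void main()
{
    import std.stdio : write;

    auto rows = [PriceRow("2020-06-11", 10.0, 10.5),
                 PriceRow("2020-06-12", 10.5, 11.5)];
    write(toCsv(rows));
}
```

Returning a string rather than writing a file keeps the caller in control of where the text ends up.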
Jan Hönig <hrominium gmail.com> writes:
On Friday, 12 June 2020 at 18:22:28 UTC, Selim wrote:
 I wrote a small Yahoo finance scraper and wanted to share with 
 the community. I have been using D for a while and I think 
 contributing something to the community is good. There is an 
 example main script and a unit test. Those should get you 
 going. It currently saves the scraped data as a json file under 
 the executable's folder. I might add a public method to access 
 individual data columns inside json in the following days too.

 All mistakes are my own and I appreciate any feedback.

 https://github.com/SelimOzel/YahooMinerD

 Best,
 Selim
This could be a really cool tool to play with. For the writing out part, maybe the class should accept a function or a delegate, or some template hook, to write out the data, so users can define it themselves. Your WriteToJson could then be an example for that.
Jun 12
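
One way the delegate idea above might look is sketched below. The Quote and Miner types and the writeTo method are invented for illustration and do not reflect the actual YahooMinerD code; the point is only that the scraper hands each row to a caller-supplied sink instead of deciding the output format itself.

```d
/// Hypothetical row type for this sketch.
struct Quote
{
    string date;
    double close;
}

/// Hypothetical miner that already holds scraped rows.
struct Miner
{
    Quote[] rows;

    /// Passes every row to a user-supplied sink; the sink decides
    /// whether that means JSON, a database insert, or plain printing.
    void writeTo(scope void delegate(const Quote) sink) const
    {
        foreach (q; rows)
            sink(q);
    }
}

void main()
{
    import std.stdio : writefln;

    auto m = Miner([Quote("2020-06-11", 10.5), Quote("2020-06-12", 11.5)]);

    // Sink 1: print each row.
    m.writeTo((const Quote q) { writefln("%s %.2f", q.date, q.close); });

    // Sink 2: accumulate in memory without writing anything.
    double total = 0;
    m.writeTo((const Quote q) { total += q.close; });
    writefln("sum of closes: %.2f", total);
}
```

A JSON writer would then be just one more sink passed to the same method.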
Selim <sozel wpi.edu> writes:
On Friday, 12 June 2020 at 20:29:57 UTC, Jan Hönig wrote:
 This could be a really cool tool to play with. For the writing 
 out part, maybe the class should accept a function or a 
 delegate, or some template hook, to write out the data, so the 
 user can define it itself. Your WriteToJson could then be an 
 example for that.
Thanks!! Let me know if you find any bugs when you play with it. I wrote template classes for the write operation: one for writing to a data frame and another for JSON. I think it should be relatively easy to bind that write function to MySQL at this point.

S
Jun 13