/biz/ - Business & Finance


No.51529077

Going into Data and statistics, what's the better program to learn, R or Python?

>> No.51529129

>>51529077
if you learn python you should be able to learn R easily
so I vote python

>> No.51529131

I'd go for R

>> No.51529141

>>51529077
python
.t data engineer / quant dev for bond market maker whose company now sells MM algorithms to crypto CEXs
the main reasons are:
1.) python will have longer support as it does everything R does and more
2.) better documentation and community
3.) you won't be as siloed
----> you think you will break into data and stats, this is unlikely without a phd, shit is flat out hard. I say this as someone with a masters in data science from stanford.
I ended up "settling" for a data engineering role over statistics or data analysis.
they pay the same but mine is lower stress.
Also I am branching out learning new skills with python so I can change industries if i want

>> No.51529146

>>51529077
Just do excel

>> No.51529152

>>51529129

Yeah I've read Python is easier to learn than R

>>51529141
>python will have longer support as it does everything R does and more
I think R is specifically for data and statistics, whereas Python does everything?

>> No.51529155

>>51529077
Python

R is for smooth brains

>> No.51529159

>>51529141
I did a degree in comp science, is it worth doing a masters in data science?

>> No.51529160

>>51529146
>Just do excel

I know excel but companies want more now

>> No.51529238

>>51529129
>>51529131
>>51529141
>>51529146
>>51529152
>>51529155
>>51529159
>>51529160
actual data scientist that has been one prior to bootcamp memes
you need to learn python as it's surpassed R as the lingua franca in statistics
HOWEVVEERRR
R is significantly more powerful when it comes to the variety of stats packages available to it
AS SUCHHHH
you need to learn how to export objects/tsv to R and have R deliver them back to python
this is how you win
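A minimal sketch of the TSV handoff described above, using only the Python standard library. The file name, the column names, the made-up values, and the run_r_script helper are all assumptions for illustration; the R side is assumed to be a script that takes an input path and an output path as arguments, and it is not called here.

```python
import csv
import shutil
import subprocess
import tempfile
from pathlib import Path


def write_tsv(path, header, rows):
    """Write rows to a tab-separated file that R's read.delim() can load."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)


def read_tsv(path):
    """Read a tab-separated file back as a list of dicts (values stay strings)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))


def run_r_script(script_path, in_path, out_path):
    """Hand the TSV to R and read back whatever R wrote.

    Assumption: Rscript is on PATH and the (hypothetical) script reads the
    input and output paths from commandArgs(trailingOnly = TRUE).
    """
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    subprocess.run(
        ["Rscript", str(script_path), str(in_path), str(out_path)],
        check=True,
    )
    return read_tsv(out_path)


# Python-side round trip only; no R installation is needed for this part.
tmp_dir = Path(tempfile.mkdtemp())
tsv_path = tmp_dir / "prices.tsv"
write_tsv(tsv_path, ["symbol", "price"], [["AAA", 1.5], ["BBB", 0.25]])
rows = read_tsv(tsv_path)
```

Everything comes back from the TSV as strings, so numeric columns need converting again on whichever side reads the file.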
>excel
excel is honestly god tier. copy and paste a big data frame, screen shot some bar plots and be done with it
anyone that thinks excel is beneath them is a bootcampfaggot

>> No.51529295

>>51529141
>>51529238
What crypto bags are you holding

>> No.51529338

>>51529077
R is a weird language, you can tell it’s not made by a computer scientist. But I would learn Python first and then R second, they’re both good to know.

>> No.51529342

>>51529295
shiba and doge unironically

>>51529077
https://automatetheboringstuff.com/

>> No.51529364

>>51529342
Didn't ask you, I asked the big brains from stanford and early data science.

>> No.51529394

>>51529077

Lol nigger

Do you have graduation from Princeton or MIT ?

Asking such questions lmao

Kys

>> No.51529446

>>51529152
> R is specifically for data and statistics
this doesn't mean that python can't do data and statistics better....

>>51529159
if you want to get into the field, yes.
But only go if you get into a top school, otherwise it's a joke. The only thing you'll get out of it will be the M.S. letters on a resume

>>51529238
this is bad advice
this individual either has had too much academia experience without much industry
or their industry experience is 5-10 years behind modern cutting edge practices

>>51529295
LINK, KDA, GNS, NO IA, dogbat, BAT
these are my main bags, the rest are less than 5% of my crypto portfolio

>> No.51529472

>>51529160
the best advice I can give you is to learn some intro level python
then do some simple data collection from crypto exchanges using a websocket
first save the data to csv files (appending each message as it comes in)
then once you are comfortable, transition to putting incoming data into a ClickHouse database.
Also learn Postgres databases.
This will put you in a very good spot to figure out what you should learn based on your own interests
also give yourself an intro to web dev and computer networking even if you have no interest.
The most powerful advice I can give though is a short lecture series from MIT called
"The Missing Semester of Your CS Education"
it covers tips and tricks that are very important.
oh and get on a unix based OS (mac or linux)
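A rough sketch of the "append each message to a csv as it comes in" step, standard library only. The message schema (FIELDS), the file name, and the fake feed are assumptions for illustration; a real collector would receive these strings from the exchange's websocket connection instead of a list.

```python
import csv
import json
import os
import tempfile

FIELDS = ["ts", "symbol", "price", "size"]  # hypothetical message schema


def append_message(csv_path, raw_message):
    """Parse one JSON message and append it as a row, writing the header once."""
    msg = json.loads(raw_message)
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        # Keep only the known fields; anything else in the message is dropped.
        writer.writerow({k: msg.get(k) for k in FIELDS})


# Stand-in for the websocket feed (made-up messages).
fake_feed = [
    '{"ts": 1, "symbol": "BTC-USD", "price": 100.0, "size": 0.5}',
    '{"ts": 2, "symbol": "BTC-USD", "price": 101.0, "size": 0.1, "extra": "x"}',
]

csv_path = os.path.join(tempfile.mkdtemp(), "trades.csv")
for raw in fake_feed:
    append_message(csv_path, raw)

# Read the file back to check what was saved.
with open(csv_path, newline="") as f:
    saved = list(csv.DictReader(f))
```

Opening the file in append mode for every message is slow but simple, and it means a crash loses at most the message in flight. Swapping append_message for a database insert later leaves the rest of the loop unchanged.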

>> No.51529548

>>51529077
I prefer R for anything that isn’t ML.
R makes it very natural to write vectorized operations, it has very concise syntax for manipulations, it is supported in most ecosystems just about as well as Python.

There is basically only one circumstance you absolutely need Python, and that is when you are running GPU calculations on ML

>> No.51529599

>>51529146
dangerously based

>> No.51529631

>>51529446
>this doesn't mean that python can't do data and statistics better....

That's pretty much what I've heard from the comparison. Python can do everything but R is better for stats and data

>> No.51529658

>>51529548
R is only easier for stats because that's how you learned it
if you learned it in python (via numpy and pandas) you'd find that easier.
The reason to use python is so you can have your full stack in one language.

>>51529599
if you can use excel you aren't working with that much data..
or you summarized a lot of data to a condensed format that you would rather keep in python.
good luck automating plot generation in excel
or creating systematic jobs

>> No.51529675

>>51529446
R is certainly better for statistics than Python

>> No.51529688

>>51529631
see >>51529658
R is only "better" for stats because that's how professors teach stats (it's moving to python for good reason)
R was genuinely better before 2010
but there is a reason that google, who made tensorflow, chose python.
And why facebook, who also creates open source statistical solutions, chooses python.
It's because it's the best overall unless you are doing homework problems
then R is fine

>> No.51529710

>>51529675
why
>captcha: NAW TX

>> No.51529737

>>51529675
also... if you are going to root for R you may as well just sell OP Julia
Julia is clearly superior to R for statistics (and python)
but I would not recommend learning Julia first as python is the most well documented language with the most youtube tutorials and you can branch outside of stats
TL;DR: don't learn R, learn Julia, but really just do python

>> No.51529849

>>51529446
Based stanford hyperbrain portfolio literally so smart it bends spacetime from the raw equivalent mass of your IQ. Don't think too hard or the world might literally stop functioning according to the laws of nature.
WGMI

>> No.51529864

>>51529737

I've been looking at Category Manager job postings and never seen one with Julia as the preferred program

>> No.51529982

Python is absolute dogshit compared to R.
You can tell python data analysis is for midwits because they call it „data science“ instead of statistics.
R is tailor made for statistics and actual experts make packages, meanwhile doing stats with python is trying go jam a square peg in a round hole. And you have to hope your „open source“ modules arent maintained by some Indian who just stole it from someone else.

>> No.51530020

>>51529658
No, I taught myself R when I was doing ML genomics research.
Pandas and Numpy are very shit compared to R, that is just facts. R-base has a very intuitive manipulation scheme and it is only enhanced by phenomenal community support which brings your natural vectorized language down to C++ in easy to use packages.

That being said, Python is the only way to really develop any useful neural networks. Although honestly you don’t really need anything other than AutoKeras for most applications.

And everyone in this thread didn’t even mention sql. SQL is an absolute must have, nobody will touch you with a 20 ft pole without SQL, you can’t ever go cloud without it.

>> No.51530393

>>51529446
Based dobro gmi

>> No.51530549

>>51530020
>Pandas and Numpy are very shit compared to R,
not really, i started out my programming journey with R btw.
But the stanford CS department relies on python so I got more used to manipulating arrays via numpy. It takes a minute but once you get it, it makes a lot more sense than R. The complication in python's numpy comes from the fact that it can do more. Overall R is "better for stats" than python, but if you are going to learn one language for stats you may as well focus on Julia as it is better than R (just smaller community to learn from).
Python is the recommended route because it houses so much functionality in one language. If you really need R after just learn Julia, kek
>autokeras
relies on tensorflow which has an innate memory leak issue, pass and use pytorch
>And everyone in this thread didn’t even mention sql.
I literally did without saying it, i recommended ClickHouse and Postgres (those two relational databases cover 99.9% of use cases).

But yes OP learn SQL. thing is, SQL is a joke and you will basically master it if you learn pandas
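To illustrate why the two skills overlap: a SQL GROUP BY is the same split-then-aggregate idea as a dataframe group-by (in pandas that would be something like df.groupby("symbol")["size"].sum()). The sketch below uses the standard library's sqlite3 with an in-memory database and plain Python for the comparison, so it runs without pandas. The table name, columns, and rows are made up for illustration.

```python
import sqlite3
from collections import defaultdict

trades = [("BTC", 2.0), ("ETH", 1.0), ("BTC", 3.0), ("ETH", 4.0)]  # made-up rows

# SQL version: load the rows into an in-memory table and aggregate per symbol.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, size REAL)")
conn.executemany("INSERT INTO trades VALUES (?, ?)", trades)
sql_totals = conn.execute(
    "SELECT symbol, SUM(size) FROM trades GROUP BY symbol ORDER BY symbol"
).fetchall()
conn.close()

# Same aggregation by hand: split by key, then sum within each group.
py_totals = defaultdict(float)
for symbol, size in trades:
    py_totals[symbol] += size
```

Both versions give a total of 5.0 for each symbol here. Joins, window functions, and indexing are where SQL needs separate practice.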

>> No.51530595

>>51530549
>yes OP learn SQL.

I am. It's part of the curriculum. It's the whether-I-learn-Python-or-R part that is confusing me

>> No.51530628

>>51530595
python for the pure reason of not pigeonholing yourself.
you say you want to do data / stat but why limit yourself?
I interviewed with 20+ quant finance jobs and the only languages I was asked about were python, c++, and matlab.
R was not mentioned once

>> No.51530661

>>51529077
Python then R

>> No.51530678

>>51530628
I'm going into Category Management, which is more statistics and big data management. Some companies use R but like everyone has already said, Python is quickly taking over

>> No.51530686

>>51529159
Master's is required to take on basically any specialized role. Even though for the most part what you learn is useless for the actual jobs.

Seriously, other than nepotism, the only ways into the specialized fields in tech are that or an insane catalog of projects showing you're skilled in it - basically reaching the point where you'd be able to sell your skills as a service for just as much money as the entry job anyway

>> No.51530753

>>51530686
Is a part-time msc online enough to break into these roles? Work is offering to pay for my msc in statistics but I'm not sure if an online part-time msc is respected enough to land me data science roles or not.

>> No.51530808

>>51530753
depends on the school
you dont have to tell anyone you are interviewing with that it was an online program, or part time...

>> No.51530841

>>51530753
Yeah just put the school name & degree

>> No.51530915

>>51530808
>>51530841
School is Russell Group in UK. Won't they notice that it's an online program if they see that my work exp years coincide with my msc years? I'll definitely try to hide it but my question is would it make a noticeable difference if they knew?

>> No.51530926

>>51530915
If it's specifically an online school then yeah they'll obviously know

>> No.51531078

>>51530915
cut off a year of work or something
ex:
> job 1: 2018 --> 2020
> job 2: 2020 --> 2021
> Masters: 2021 --> 2022
even if you worked your job in 2022...

also some people work during full time masters, fucking psycho big brains, but i was in classes with SWEs from google when I was doing my masters. They were there mostly for the credential boost and to learn the newest ML techniques, not like they needed intro to python lol

>> No.51531326

>>51529077
you should learn both, retard

>> No.51531798

>>51531326
Obviously the best option but I need to learn one first

>> No.51531913

>>51529238
>excel is honestly god tier. copy and paste a big data frame
When compared to Python, Excel is not very handy when it comes to very big dataframes, not to mention filtering datasets. Also, it is not very good for custom graphs. Even matplotlib is better than Excel, when it comes to making customized graphs.

>>51529077
Just go with Python. R is super easy to pick up after that. If you start with R, you will learn bad programming habits.

>> No.51531950

The fuck even is a product owner?

>> No.51531951

>>51531913
It seems like everyone is saying Python. Any good books to start with?

>> No.51533476

>>51529077
I used R for my PhD, I much prefer it for statistics. Python for machine learning and prototyping.

>> No.51533543

>>51529238
>excel is honestly god tier.
I can tell you're a pajeet working at some no name start up thinking that 5GB datasets is "big data"

>> No.51533583

>>51529077
I'd argue R gives you a bit of a better base to learn other languages because it doesn't rely on the fucking indentations to denote scopes. Fucking hate that shit in Python.
That said, Python will force you to adopt some better formatting habits.

>> No.51533612

>university enforced stata
>literally noone uses it

>> No.51533625

>>51531951
Learn Python the Hard Way

>> No.51534101

>>51529141
>you think you will break into data and stats, this is unlikely without a phd
So not worth doing unless you're Uber smart? Was thinking about learning data and stats on my own.

>> No.51534158

>>51533625
With some app on my phone?

>> No.51534206

For a job in data crap: Python (and SQL!)

For some actual stats work: R

>> No.51534270

>>51534101
You can go that route, but you'll be up against the PhD's when it comes to applying for jobs.

I agree with >>51529141
Data Engineering is basically just moving data around. Think of it as the pickaxes and shovels whereas Data Science is the actual gold.

>> No.51534306

>>51534270
>Data Engineering


Are we really gonna do this?

>> No.51534428

>>51534306
You're going to have to elucidate, bud.

>> No.51534488

>>51534101
no its worth doing, but dont expect to become some data analyst at faceberg
realistically you can get small company jobs, they will want the python proficiency but in the end you will end up working with excel because your coworkers will send everything in excel

>>51534158
this >>51533625 is a book, google it. They had a website but it appears that you'll be better off going to automatetheboringstuff dot com (maybe misspelled)

>>51534270
is absolutely correct, it is industry dependent.
However one thing, any good data scientist is also a competent data engineer. So learn the data engineering skills first then shoot for a data science position if that's what you really want

>> No.51534525

>>51534488
>but dont expect to become some data analyst at faceberg
Kewl. I really don't need that much so I'd be cool with that.

>> No.51534550

R isn't respected or recognized at all outside universities. Nobody cares.
So stick to python

>> No.51534871

>>51533612
Classic lule

>> No.51534931

>>51529077
I know both, but I only use R for data science stuff. But I am a biochemist by education and not an informatician. Which one is preferred depends on industry and employer.

>>51529238
>excel is honestly god tier
I have worked with experiments that had ~300 000 data points with at least 12 variables per data point. Working with excel is an absolute nightmare with that amount of data. 100% nightmare. From general stability to any type of multivariate data analysis.

>> No.51535062

>>51534488
>realistically you can get small company jobs, they will want the python proficiency but in the end you will end up working with excel because your coworkers will send everything in excel

If working as a data analyst that will happen sure.

>any good data scientist is also a competent data engineer

Debatable but it's a good idea as a DS to be aware of several things related to Data Engineering and of course to know SQL up to an intermediate level.

>So learn the data engineering skills first then shoot for a data science position if thats what you really want

I see your point here but what's going to happen is that you'll be pigeonholed into Data Eng. work if you show that you're more competent on those skills. If you're fine with that cool but if you want to actually do more applied stats and modelling work that's going to backfire...

>> No.51536440

>>51534871
This was a highly accredited MSc in Finance 2 years ago lol

>> No.51536695

>>51529077
Data Scientist here. We all use R in our organization, but we integrate it into our DB using ODBC. That's the hint I will give you. You want to know how to use full stack R, that is the power move. Like another anon said, Python will let you go up and down the stack in one language, but the scikit-learn shit is not as robust as R's autistic libraries.

>> No.51536923

They make us use SAS at my job lmao kill me
But it was my Python skills that got me the job
I was also a psych major, I got in because I had a decent portfolio showcasing my abilities.
However a lot of places showed me the door because I didnt have SQL job experience, even with certs

>> No.51537416

>>51534931
working with excel even on small datasets
>muh non-utf8 encoding xD
>muh limited export options xD
>you have to use the software like WE, your gods and masters, intend
>if you want anything done you can write VBA code ^___^
and I'm a goddamn forester, I drive 5h a day in a car that reeks of sweat. R/Python are the way. Death to m$.

>> No.51537452

>>51529077
python (learn numpy, pandas and scipy also, learning tensorflow and/or pytorch will also be a big help)
R is a dying lang, but if you've learned python and get an R job you'll pick it up in no time so don't worry

>> No.51537901

So Learn Python the Hard Way has a 4th edition but Amazon only has the 3rd edition. Is there much of a difference between the two?

>> No.51538409

>>51537901
no but i bet if you went to libgen dot io you could probably grab it for free :) (and be able to copy paste which kindof defeats the purpose of the book... lol)

>> No.51538426

>>51537901
Dude fuck a book. Think of some project you want to accomplish and dive right into it. It's really not that hard: it's like using an application only instead of using buttons you write sentences. Python is leaps and bounds easier than other languages in this regard.

Idea: write a program that scrapes a 4chan thread and returns every post with repeating digits. Read about the 4chan API and read somewhere on the internet how to work with JSON.

https://youtu.be/vmEHCJofslg
This is the single video that got me started. If you try to read your way into programming it'll just be an abstract thing. You need to figure out right away it's not that hard. Download jupyter notebook, it'll make trying through failure so much easier
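One way the repeating-digits project idea could look, as a sketch using only the standard library. It assumes the thread JSON has a "posts" list whose entries carry a numeric "no" field, and that threads are served at the a.4cdn.org URL pattern shown in fetch_thread; check both against the current 4chan API docs before relying on them. The sample thread below is hand-written so the logic runs without a network call.

```python
import json
import urllib.request


def trailing_repeat_length(post_no):
    """Count how many identical digits the post number ends with."""
    digits = str(post_no)
    count = 1
    while count < len(digits) and digits[-count - 1] == digits[-1]:
        count += 1
    return count


def posts_with_repeating_digits(thread, min_repeat=2):
    """Return posts whose number ends in at least min_repeat identical digits."""
    return [
        post
        for post in thread.get("posts", [])
        if trailing_repeat_length(post["no"]) >= min_repeat
    ]


def fetch_thread(board, thread_no):
    """Download a thread as JSON (URL pattern is an assumption, see above)."""
    url = f"https://a.4cdn.org/{board}/thread/{thread_no}.json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Hand-written sample standing in for a downloaded thread.
sample = json.loads(
    '{"posts": ['
    '{"no": 51529077, "com": "first"}, '
    '{"no": 51529129, "com": "second"}, '
    '{"no": 51530555, "com": "third"}]}'
)
hits = posts_with_repeating_digits(sample)
```

With a live thread, posts_with_repeating_digits(fetch_thread("biz", 51529077)) would be the whole program, provided the thread is still online.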

>> No.51538485

>/biz/ knows more about coding than /g/