Machine Learning with Python | Machine Learning Tutorial for Beginners | Machine Learning Tutorial

With Python being so easy to learn and having really powerful capabilities, it seems to be the dominant choice for new learners and industry professionals alike. Machine learning and Python are like peanut butter and jelly: an unbeatable match. You don't even have to have much experience in the technical industry to learn either of these topics. With these skills in hand, many people go on to work at companies like Netflix, Goldman Sachs, and Deloitte. Tutorials like ours are enough to get you started on your journey to becoming proficient in machine learning and Python. Who knows, you might be the next industry expert.

This course consists of different aspects of machine learning as well as Python. We have three amazing faculty members, experts in their own right, to help you get started with this discipline in a comprehensive manner: Dr. Abhinanda Sarkar, one of the top 10 data science academicians in India; Dr. D Narayana, a postdoctoral fellow; and Mr. Raghu Raman, a big data expert.

Now, before we start off with the session, I'd like to inform you that we will be coming up with a series of high-quality tutorials on artificial intelligence, data science, and so much more. So don't forget to subscribe to our channel and hit the bell icon to get a notification every time we upload a new video. If you like this tutorial, don't forget to click on the thumbs-up icon and comment down below what you'd like to see next.

Here are the topics we will cover in this tutorial. First, we will get you to dip your toes in some programming with an introduction to Python and Anaconda. Then we'll cover different libraries in Python, including pandas and NumPy, for data processing and manipulation. After that, we shall see how we can visualize data in Python using libraries like Matplotlib and Seaborn. We'll then get into a statistical approach: understanding how statistics is different from machine learning, the different types of statistics, and an understanding of data, how it is
distributed, how it can be measured, and how it can be represented. We'll get you right on track to handle machine learning algorithms that come your way. Moving forward, you will be introduced to reinforcement learning, its principles and its framework. Then we will learn about Q-tables, Q-learning algorithms, and a case study of a smart taxi to help you understand reinforcement learning better.

Let's get introduced to the faculty of this tutorial. Dr. Abhinanda Sarkar received his PhD in statistics from Stanford University. He has taught applied mathematics at the Massachusetts Institute of Technology. He has also been on the research staff at IBM. Dr. Sarkar has also led quality, engineering development, and analytics functions at GE, and has co-founded OmiX Labs, a biotechnology startup.

Dr. D Narayana has more than 11 years of industry experience and over four years of academic research experience. Over these years he has worked as a professor, trader, project manager, team leader, and developer. He completed his PhD in mathematics from Pierre and Marie Curie University, and his MSc and BSc in mathematics from Pondicherry University and Sri Krishnadevaraya University, respectively. He was also a postdoctoral fellow at the Institute of Science. His areas of expertise include machine learning, optimization, financial engineering, and high-frequency and algorithmic trading. He has published over 10 research articles in international journals and conferences.

Mr. Raghu Raman is a big data and AWS expert with over a decade of training and consulting experience in AWS and Apache Hadoop ecosystems like Apache Spark. He has an impressive resume, having worked with global clients like IBM, Capgemini, HCL, and Wipro, to name a few. He has also worked with many Bay Area startups in the US. With that, let's get into this Machine Learning with Python course. It's time for some great learning.

So, like he just said, my name is Raghu. I have been working for about 12
years as of now. I started as a developer, a Java developer basically, and then moved into something called big data. So I have been a big data developer for quite some time, and then moved into machine learning, because the industry changes; you can't stick to one thing. I also teach and practice a bit of cloud computing with Amazon Web Services. So that's what I do. Currently I work as a consultant for some companies. I help them in solving their problems and, you know, giving them ideas on what to do, et cetera.

I also train people. At Great Lakes, as of now, most of my training sessions are in Python, which is what we're going to do, and statistics as well as ML. These are the areas where I usually train at Great Lakes. We also have a separate program called Big Data on Cloud, and I'm also one of the trainers for that program. That's a totally different program, not your program, but I was one of the coordinators for that program and also a mentor for it. That one is mostly a little bit of cloud and other kinds of technologies.

Well, let's get into what we're going to do. This is called Introduction to Python for ML. I think from the industry session you would have got some idea about what the future will be, what you're going to learn over the period of time. When somebody picks up the track of machine learning and starts to learn what machine learning is, it's really important to have a programming language; that is for sure. Whatever you want to do, you depend on a programming language. And there are many languages you could have heard about: there is a language called R, there is SAS, and we also have Python. We will be using Python throughout the course.

Now, you may ask, "Why are you teaching using Python? Why don't you just use R or any other language?" Well, Python as such has some
advantages. If you look at a language like R, for example, that is mostly used for these data analytics kinds of jobs, whereas Python is a universal language. You can do web development in Python, or you can do any sort of programming in Python, not only things related to machine learning. So when you go to the industry and start working with real projects, maybe you are doing some sort of ML project, but that project also has a part where you're probably building a web application or making an API call or something like that. There Python will come in handy. You learn one language which can be used in many places, and that is why we are using Python, basically.

The second reason, obviously, is that it's very easy to learn. Many people are confused: "I'm not from a programming background," or "Probably my programming is not as good as the other guy's." Python is very easy to learn compared to other languages. If you're comparing it with something like Java, it's pretty easy. I have been working as a Java developer, so basically my skill set is Java, Scala, and Python. I also work on Scala, which is one of my primary skill sets, and then Python. When I compare these together, learning Scala or Java is very difficult for you. That is why we have Python as the primary language. Python also has support for a rich set of libraries, which I think the previous person would have explained to you.

What we will be doing today and tomorrow is learning practically. There is no point in talking about the history of Python and what Python is for four hours. Nobody's going to learn anything at the end of the day. I also have slides, but I think even that is pointless; you can just run through some slides and explain. Some concepts are important, but the majority of things we'll be doing practically. And, uh, did you get a
chance to go through the LMS and see what is there for the prerequisites? Have you guys gone through the LMS by any chance before coming to the class? Yeah? So there is a Python prerequisite. I don't know how many of you have gone through that. The first thing that you need to do is install something called Anaconda. Have you installed it? Yeah? Okay, that's good.

Python is open source, by the way. So you can just go to the Python website and download Python. That's one way of working with Python. But that is not really useful, because if you're downloading the open source Python, just original Python, it will not come with any of these libraries for you. So usually we prefer a distribution of Python called Anaconda. Anaconda is nothing but a distribution of Python. When you download and install it, you will be getting Python. You will also be getting some of the libraries which are very commonly used with Python. It is very easy to update, and if you want to add any other libraries, you can do it in Anaconda. Anaconda is one of the most famous distributions of Python.

We also have another thing called Enthought Canopy. There is something called Canopy; that's another distribution. Canopy is not as popular as Anaconda, but they're similar. The only difference is the name and the project, but at the end of the day, if you download Canopy you also get Python, same thing.

Now, this is a distribution which comes with certain tools inside, which I'll show you. One thing you will be getting is Python, for sure. Then you will be getting the libraries related to Python. Then you will get an IDE for Python. Like you have Eclipse for Java, very similar to that, you have an IDE called Spyder for Python. So if you're using Anaconda, you'll be getting Spyder and some debugging tools, et cetera, like that, but
at the end of the day, they're all open source. It's not like you're using any proprietary thing, but it makes it easy for you to download and install. We'll be using Anaconda for all the hands-on practice. So this is a prerequisite: you install Anaconda, and then I will show you what to do once you have installed it and how to test it.

Then, we will be using Python 3. I don't think this requires any explanation, but we have a version called Python 2, which is an old version. Python 2 was officially, you know, removed from the community. Not removed; the support was removed. So Python 2 would no longer exist, actually. But in some of the older projects we still see Python 2, people writing the code. We are using Python 3. There is no major difference, just some small syntax differences and performance improvements in Python 3. But everywhere people follow Python 3 these days.

Now, how are we going to learn Python? One thing you need to understand is that you are going to learn Python in the aspect of data science and machine learning. You are not going to learn Python to create a website, for example. That's different. You will be learning Python specifically for data science or machine learning, which means when you start learning Python, there are certain libraries in Python which you might have to make use of. There is a library called NumPy. Then there is something called pandas. Then there is something called Seaborn. Now, you don't have to really remember them; we'll cover them anyway. And then there is something called scikit-learn, and so on.

So what are these things? These are all libraries which you can download and install along with Python. And the good news is that when you download and install Anaconda, all of this will come by default, like free. We don't have to separately go and download them. Right. So this is
actually your domain for the rest of the year. You will be playing here. This is where you will actually work, at least 70 to 80% of the time. I'm talking about ML; when you go to deep learning it's a little bit different, but usually on the ML side this is where people play around.

So what are these things? NumPy is a library whose name stands for Numerical Python. It is used for things like numerical data processing. For example, I can have things like two-dimensional arrays, three-dimensional arrays, multidimensional arrays. Mostly for scientific calculations, you need to represent your data in the form of an array. You know what an array is, right? An array is just a collection of numbers. I can just say 1, 2, 3, 4. This is an array. Well, this is technically a list, but something like this is an array, a collection of numbers. Now, this is one-dimensional. I can also have two-dimensional, three-dimensional, and even more than three-dimensional ways of representing the data. If my project requires that my data be represented in that fashion, and then I have to do some manipulation on my data, we use the library called NumPy. We will explore NumPy and you will understand how this is happening.

Think about image processing, which you'll learn later. Imagine you want to do something like image processing. At the end of the day, when you read an image or something, it will be converted to bits and bytes, basically. You know that, right? If I have a JPEG file, I will read it as an image. But when I want to process it, I cannot just process an image as such. I'll convert it into numbers, basically ones and zeros. Then I'll be using these NumPy arrays to load that data and then do matching and classification or whatever you want to do. So NumPy arrays are very important when it comes to your deep learning, image
classification, some of those contexts. Maybe not immediately; you will not work on NumPy right away, but a basic knowledge of NumPy will be helpful for you in the future. All right, so this is one thing.

Okay, so I will probably explain that. Naturally the question will be, "What does the fourth dimension look like? Are you Christopher Nolan?" Obviously people will ask. Two dimensions, you know what that is. Three dimensions, people can also imagine. What about the fourth dimension? Then the discussion goes onto a different track. But yes, you can actually visualize four dimensions. I can show that. It's not a big deal, it's not rocket science, it's very simple. In some scientific calculations we need the data in the form of four dimensions, and then we do additions and subtractions on the data. So you need a different type of data structure to represent that. Your normal data structures cannot easily handle four dimensions or five dimensions. That is when we say we'll use NumPy. In your projects or upcoming sessions this will be explained further whenever NumPy comes up.

And pandas, this is what we will be starting our analysis with today. Pandas is a library for labelled data analysis. You guys are familiar with Excel, right? Excel sheets. The same thing in Python is pandas. You have labelled data, like rows and columns, and then you want to process the data. Say I have a dataset with a million rows or 10 million rows, and I want to find out something from that. If I use Excel, it may not be a good idea. For normal calculations Excel is fine. But if my dataset is 10 million or 20 million rows of data, that actually makes a problem.

We had a customer I was working with, Mercedes-Benz, right? They were one of our customers, from Germany.
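
To make the NumPy and pandas descriptions above concrete, here is a minimal sketch. It assumes NumPy and pandas are installed (both come with Anaconda, as mentioned earlier). The column names and values are invented for illustration and do not come from the session.

```python
import numpy as np
import pandas as pd

# NumPy: the same array type handles one, two, or more dimensions.
one_d = np.array([1, 2, 3, 4])        # shape (4,)
two_d = np.array([[1, 2], [3, 4]])    # shape (2, 2)
four_d = np.zeros((2, 3, 4, 5))       # four axes, 120 elements in total

# Arithmetic applies element by element, whatever the number of dimensions.
doubled = two_d * 2
shifted = four_d + 1

# pandas: labelled rows and columns, much like a sheet in Excel.
# These column names and values are hypothetical.
readings = pd.DataFrame({
    "car": ["A", "A", "B", "B"],
    "speed_kmh": [40, 80, 60, 100],
})

fast = readings[readings["speed_kmh"] > 50]                 # filtering rows
mean_speed = readings.groupby("car")["speed_kmh"].mean()    # grouping

print(one_d.ndim, two_d.ndim, four_d.ndim)
print(fast)
print(mean_speed)
```

The four-dimensional array is simply an array whose shape has four entries; nothing about it needs to be pictured geometrically to add to it or subtract from it.
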
Mercedes, what they do is this: all their cars have sensors. They make cars, right? If you look at the Mercedes-Benz S-Class, it is one of the safest cars in the world. Why? Because it has around 200 different sensors on the car. They collect the sensor data while the car is moving. It'll track the acceleration, braking, pedestrians, everything, using the car. So imagine the car is moving; if I'm traveling for, I don't know, probably three hours, it'll collect the data. This is text data, and the amount of text data that you get, normal text, will be in terabytes. So more than billions of rows. But we can't help it, because every second, or even less than that, the sensors will keep on tracking the motion of the car and what it is doing, and they have to collect it. And at the end of the day their problem is, "I want to analyze it and then probably make a machine learning model."

So suppose I'm giving you such a file, a text file or a CSV file, and I'm saying that there are, you know, 100 billion rows. Obviously you cannot load it in an Excel sheet; it doesn't even work that way. That is when we come to pandas. Pandas is one of the libraries which helps you do that. There are other libraries too, but pandas is the most common. It works very well with labelled data: you have rows and columns, and then you can do selection, filtering, indexing, grouping, joins, anything that you normally do on a dataset. So we'll be spending a lot of time on pandas. That is one place where we need to spend time.

Matplotlib: this is used for visualization, like when I want to draw a graph. It's very important to tell a story to your customers. When you have a customer, he will give you some data, and then you want to talk to the customer, so you will analyze the data. Usually, it's very rare that we use Matplotlib
directly. We use a library called Seaborn. Seaborn is built on top of Matplotlib, basically. This is the library actually used for visualization, where you can have nice graphs, bar charts, pie charts, scatter diagrams, box plots, or any type of plotting you want to do.

But then you will be thinking, "Hey, this is very familiar. Even I can do plotting." If I ask you to draw a plot, what tools will you use, apart from Python? Excel, or probably PowerPoint, or something like that, right? The difference here is that these libraries, Seaborn and Matplotlib, allow you to do statistical plotting. For example, I have a set of data. I want to calculate the mean, median, and standard deviation, and based on that I want to plot, or I want to include those properties. They will allow that. So it's not like a normal X-Y plot that anybody can do; I don't need Matplotlib for that. These two libraries have a bit to do with the statistics side.

Now, the problem is that I will cover Seaborn tomorrow for sure, but in order to completely understand Seaborn you need to have some statistics base, and your statistics class will start only tomorrow afternoon. My class is in the morning. But we will cover it so that you will understand what it is. And with visualizations, initially you will not be touching a lot. As you proceed, what sort of visualization to use will be covered in the respective topics. So these two are basically visualization.

And scikit-learn is where you apply actual machine learning. Scikit-learn is your primary library for all the ML algorithms that you will be using. After the next month or so, when you actually start learning machine learning, most of your machine learning algorithms are inside this library called scikit-learn. I will not be touching scikit-learn, because that is the whole course. So really my job will be to cover
basically pandas, mostly, and then Matplotlib and Seaborn in one, and a bit of NumPy. That is the job given to me, to at least give you a basic idea. And again, if you look at something like pandas, the topic is very exhaustive. It is not like I'm going to teach you everything about pandas in three or four hours. That's impossible.

So the way we do this is that I will be covering the basics of all of this today and tomorrow, and on Sunday you have a practice session. Normally we have a practice session where somebody will be coming to assist you. You will have some questions to be solved, answers will be there, and you practice. And the good news is that your next class will be after a month, and that class will be a statistics class, not a Python class. Then the technical class will start, so technically the next class will be after two months, when you start ML. So you will get around two months to go through what I have taught you. That way we give you enough time to pick up Python. If somebody is thinking, "Okay, I need some time to learn Python or get some idea," you will get a good amount of time to pick it up.

One more thing: I don't want you to immediately pick it up. Normally trainers will not say this. But I don't want you to go home today and say, "I want to learn Python for three hours now, and I'm not going to talk to anybody." It's not like that. You can't learn it that way. Today and tomorrow, mostly, I want you to pay attention and understand what we're doing. You are not really expected to master anything yet. Then the Sunday class will give you some more idea. Then I'm going to share a set of assignments, practice exercises, and solutions, which you must do. I'm also going to share some learning material to cover over the period. The idea is that you are all working professionals. So you
spend some time every day, probably half an hour or 15 minutes. That's more than enough for you to get prepared for Python. And of course, in the ML classes Python will be used in a different way. In the ML classes you will be having pandas for sure.

Okay, I forgot something. In the last class there was a small confusion. This whole thing that we're doing is called EDA, and I don't want you to forget this name. There is a reason behind this. It's called exploratory data analysis. This is funny, actually, because in the last Python class I taught all this, but I didn't say this name. In the next class the trainer said, "Okay, all of you have completed EDA?" They said, "No, the trainer did not teach a single line of EDA." And he got confused. He immediately called me: "What did you do for eight hours? Did you actually skip your class? I'm going to check the recording." I said, "No, I didn't skip. I taught all this." "No, the participants are saying they didn't learn even a bit of EDA." So keep it in mind. If somebody asks you, say you learned EDA. See, I'm also doing a job, right? Great Learning was like, "We know he was in the class, but what he was doing, we should probably check the recording." It got that bad. Actually no, I mean, it happens sometimes, you know.

So this is called EDA. Why is it called EDA? Because usually in a typical machine learning project, what you do is first collect the data, whatever data you have, and then you do this part called EDA, exploratory data analysis. There is a slide I will probably show you which will be very consoling for many of you, because a lot of people are concerned: "I have to do a lot of programming now. After learning all this ML, when I get a job, I have to sit and program." There is a slide which shows that around 70 to 80% of the time in an ML project you're not really sitting and
programming. You're just trying to understand your data. And that's true; I'm not making this up.

When you start an ML project, the problem is like this. We were working on a project from the New York Children's Hospital. It's a children's hospital, right? They were trying to detect early cancer in children. The end goal is that you should be able to figure out, given a patient, what is the probability that this patient will get cancer or not. That is the end goal of this project. So you know this is the project, and we know, okay, this is what you need to do. But the real challenge is: if I need to do this, what is the data that I'm going to look at?

So then you start understanding. The first thing is that you understand the domain, because I had no clue about the healthcare domain, or specifically about cancer. So you understand some aspects of the domain. What is cancer? What causes it? What are the factors which influence it? This will take roughly one to two months, to understand the domain and the features of our data. Then you talk to the hospital and the doctors and try to collect the data. Initially you may not get enough data. You will just get the patient records, which will have their age, weight, and statistics, and then you ask for more data. The more data you get, the more features you can get from the data, and that will impact the ML model.

So actually doing the ML is not as big a deal as you are thinking, because you have an algorithm to do this. Probably in the morning you would have discussed this. For every machine learning problem we have an algorithm which can solve the problem. All you need to do is call the algorithm, give the data to the algorithm, and it will give you the result. It's probably a 10-minute job, really. So it's not like it's very complicated.
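
The "call the algorithm, give it the data, get the result" step can be sketched in a few lines. This is only an illustration using scikit-learn's LinearRegression on made-up numbers. It is not the model or the data from the hospital project, and the variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up toy data that follows y = 2x + 1 exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one row per sample, one feature
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression()   # the algorithm is already written
model.fit(X, y)              # give it the data
prediction = model.predict(np.array([[5.0]]))  # ask it for a result

print(prediction)
```

The call itself is short. Everything that decides whether the answer is useful happens before it, in choosing and preparing what goes into X and y.
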
But before you call the algorithm, you should have the right data in the right format to feed it. Then it's going to give you the expected result. The majority of the time, data scientists are spending their effort on understanding the data and seeing, "Can I collect other varieties of data? What features can I extract?"

Initially we were getting only patient records. We built a model that was not very powerful. It was giving results, but not really what the center was expecting. Then we had further discussions, and they said, "We will also share some more data," data which is not directly related to cancer but covers some of the things which may affect it. For example, we started looking at gene patterns. There is a study going on looking at whether cancer can be hereditary, and what the effect of that is. That is a whole different question. Now you need that kind of data, where you know a person's heredity: his genome data, father, mother, siblings, and all this. What is the history of cancer in the family? Then, how do you represent that data?

So the majority of the time you're playing with the data. That is this EDA part. Once your data is finalized and you have an idea, "Okay, this is what I want," calling the ML algorithm is not a big deal. Everything is already written. If I'm calling a regression algorithm, it's already written. I'm not writing anything; I'm just passing the data and getting the output. But then I validate: is what I'm doing correct? If it is not correct, then I need to rebuild my data and train my model again.

That is why this EDA part is very important, because you have to collect the data, look at the data, extract certain features from the data, and then compare. That is where EDA becomes very useful. These are not the only EDA methods. There are further methods as
well, which you will cover later. These are some of the basic EDA things.

That project actually got put on hold, but we went up to 96 to 97% accuracy levels in it. Right now that model is not running. I was not actively a contributor in that project, but I was helping them, since it was a new project. We were able to get a good accuracy level in that. It depends on what kind of problem you are solving; each problem is actually different.

Like I said, what we will do is look at the basics of Python a little bit, then we will go to pandas and do some hands-on with pandas, and then NumPy, and then this visualization stuff. So I think I should get started with the hands-on part. If you can look at your laptops: today you can go along with me if you want. Maybe tomorrow I will do it myself, because it also takes time. Depending on what we need to cover, sometimes tomorrow I will just demonstrate and you can practice later. That will not be a challenge.

If you have installed Anaconda, you can just search for Anaconda. See, there is something called Anaconda Navigator, and you can just open it, just to make sure that things are working fine. Well, Jupyter is in fact not an IDE. Spyder is the IDE. We'll be using Jupyter for sure. Spyder we don't use at all for ML projects. I'll show you. One more thing you can do: if you're not really comfortable programming, you can just take help from somebody who's sitting next to you.

If you open Anaconda, you will see these icons. Among these icons, we are really interested only in this thing called Jupyter Notebook, the second thing that you see. There are many things here, and we are not really using any of them. For most data science projects, Jupyter is the primary platform for building and prototyping, at least now when you begin. And I bet once you complete three months, six months, probably you can use different tools, but to get started, what you need is this thing called Jupyter.

Either you can click on the Launch button here, or it will also be available in your programs. If you go to the programs and just type Jupyter, it may be there. Yeah, that is Jupyter Notebook. Whichever way, either click on this Launch or click on this, just open this thing called Jupyter. It will open a browser. I will tell you what it is. It's opening, right?

So what is this? Jupyter is actually an open source project. Originally it was called IPython Notebook; later it was renamed Jupyter Notebook. It is a browser-based interactive shell for Python. Normally, if you're writing a program, either we take Notepad or something, or we open the command prompt and write the program. If you're doing something like Java, you use Eclipse or something. In the case of Python, since it is sort of a scripting kind of language, it's very easy to write the code. What Jupyter allows you to do is create something called a notebook. A notebook is an environment where you can type your code, run the code, and see the output, and the advantage is that you can also share it with others.

Just to show you an example, we will pick up some notebooks, and you guys can do this too. You can say New; there's a New button, and there is something called Python 3. You click on this Python 3, and ideally this will open in a different tab. So just go here, say New, Python 3, and something like this will open. Are you able to do this? Yeah? Okay, so this is called a notebook.
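
A notebook cell holds ordinary Python. As a minimal sketch, a first cell might look like the following; the variable name total is added here for illustration. In Jupyter, pressing Shift+Enter runs the cell and shows the output directly beneath it.

```python
# A first notebook cell: two variables and their sum.
a = 5
b = 10
total = a + b
print(total)
```
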
Now, if you really want to know what this is doing, it's very simple. For example, I can say a equal to 5, b equal to 10, and I can simply say print a plus b. I think you can probably understand the code; it's not difficult. So let's say I'm writing code like this: there is a variable a, that's 5; b is 10; and I just want to print a plus b. This is my program. Now if I want to execute my program, either I can click on this Run button, or I can just press Shift+Enter. It will just run the code and show me the output. So it is an interactive notebook: you can type, click, and see what your code is going to do. For most of your data science or ML projects you'll be using this tool, because in ML it's very important that at every step you see what's happening. You don't write the full code and then run it; rather, you load the data, then see what the data is; filter it, see what it is. The most convenient way to work like that is using Jupyter, so we shall be using Jupyter all the way.

Some of you are having small difficulties in starting IPython and all. It's perfectly fine even if you're not able to do it right now. Anyway, I'll be sharing whatever I'm doing over Olympus. By now, in Olympus you have a file. Can you look for this file if you go to Olympus? Can you see these files in Olympus? There is Python overview, Python visualization, store sales, Uber drives, etc. In Olympus there will be a zip file for the Python files. Do you see a zip file for the Python files? Can you download it and extract it? Once you've downloaded and extracted it, you should see these files. So basically what you need to do once you download them is open Jupyter. So there's my Jupyter, right? And all you need
to do is... let me see if it is already there. No, it is not there. Just click this Upload button; there's an Upload button. Then select the file you downloaded. Let me show you which one. Where is it... Python files, and then select this file called Python overview. Can you see? Say Open, and then upload it. So what you need to do is click on this icon called Upload and upload this Python overview file.

In some of the classes I have seen that for some participants, when they try to do this, it will not get uploaded, and the issue was with the browser. Sometimes if you're using Internet Explorer it will not upload; I don't know the reason for that. If you have uploaded it successfully, you should see it here, like this on my screen, in this form. If you just click on this file, Python overview, it should open like this.

So this is the advantage of a notebook. See, you can type your code and you can even type the explanation. I have typed something like "Course structure", what you are learning, all these things, and this is not actually code; this is just markup that I'm doing. So I can type these kinds of explanations and share them with people. It's very easy.

I want you to do one thing. Once you open this, go to the Cell menu. There is Cell, can you see? And there is something called All Output, Clear, the last option. Normally when you open a notebook it will have some outputs already, so I just want you to clear them. Just go to the Cell menu, All Output, and say Clear. You can also insert your own cells. Basically, the way the notebook works, you can see different cells; each cell can be executed independently, and you can also write your own cells if you want.

And this is the course structure. First we will look at some of the native data types in Python which are important for you, and we
will look at pandas, basic DataFrame attributes, and common data manipulation tasks using pandas. Then there are loops and functions, visualization, and other miscellaneous topics.

What I want you to do first is run this cell under "Basic data types". You have a cell here, so either you can press Shift+Enter or you can click on this Run button. And how do you know whether it has actually run? You will see this number 1 here. Can you see? This means it has run. That number is only there to identify whether that cell has executed.

And what exactly is this? Basically I'm just doing a small import here. I'm saying that from IPython I import something called InteractiveShell, and on InteractiveShell I set the interactivity to "all". I will explain this. Basically I'm just saying that I want to work interactively in the IPython notebook. We just need to run it; it's standard code, boilerplate kind of code, whenever we are interacting with IPython. I will explain what this code means in a moment, but just run this and make sure you can see that it has actually run.

Now, there is already some code here, but you can insert your own cell. I don't want you to just run what is there. You can say Insert, then Insert Cell Below, and you'll get a cell like this. You can type whatever you want there; that's up to you. So after you run the first cell, I just want you to insert a cell.

In Python, at least for the time being, you need to know only three basic data types. If you put a hash, the line will not run; commenting is done with a hash in Python, you know that, right? If you type a hash and then any code, it will not actually execute. That is why I'm typing a hash. So basically, in Python the three data types you should know are: one is called a list, then there is something called a dictionary, and then there is something called a tuple. So what I'm saying is that list, dictionary and
tuple are the data types you should know, at least for ML and these kinds of things. There are other data types in Python too, of course; there are a number of data types, but these three are very important because they can hold your data, more than one piece of it. So all of these are collection kinds of data.

The first thing you should know is called a list. So what is a list? A list is nothing but a collection of elements. And how do you create a list? Like this. For example, I can say my_list equal to, let's say, 20, 30 (you guys can also try with me if you want) up to 70. That's it. How do you identify that something is a list? By the square bracket. That is the thing you need to understand: if you see a square bracket, it's a list in Python. So I just created a list called my_list: you say equal to, and then you just type this, and this creates a list.

Now, in your Jupyter notebook, if you want to run something, I can just say Run and it should create the list for me. So now the list is created. If you want to see it, you can simply type the name. I can simply say my_list, run this again, and it should print it. This proves that the list is actually created.

Whenever you want to print something in the notebook, you don't have to explicitly type print; the last line is printed anyway. You may be wondering about this, because normally when you want to print something you say print and then the thing. In an IPython notebook, the last line, or any variable you just write, will be printed by default, so you can easily see it.

So the first thing you need to understand is that this is a list; you see the square bracket thing, right? Also, if you want to access any element from the list, you will always say my_list, I'll just type it here, and then simply use this notation: [0]. So what will happen? You can access the elements using the
index position, so zero will be the first element. It'll print 20. And if I say zero comma one, what will happen? No, it will not print; you cannot access elements like that. I'll tell you why. If you want to print just one particular element, you can say, for example, 3. This will print 50, because the positions go 0, 1, 2, 3.

Now, if you want a range of elements, for example I want to print 20 and 30, I can say something like zero colon two. When you say zero to two, what it basically means is that I want to print the zeroth and the first element, not the second one; not position 2. So if I say zero to two, it will print 20, 30; if I say zero to three, it will print 20, 30, 40. This is one way of accessing the elements within the list. You always use this colon notation to say which elements or which range you want to print. This notation is very common in Python, and not only in lists. Even when you go to pandas, we'll be using this colon: from something to something, and it'll print those elements.

Now what if I want only 20 and 50, and not everything? 20's position is zero, then what, three? You don't get it? So take it as an assignment. I'll explain this later. Normally when you want to print a series of elements you will say zero to three, and this will print 20, 30, 40. I'll tell you how to print individual elements, or you guys can figure it out and tell me; consider it an in-class assignment. So my question is: how do you print 20 and only 50? If you want, you can try and let me know. Anybody got an answer? [inaudible] No: "list indices must be integers or slices, not tuple" is what it is throwing as an error, right? Separately? No, no, I just want... okay, so zero in a square bracket, comma, then what is that, three? Right. Sorry, remove the external
brackets [inaudible]. No, that doesn't work, right? But it's a small trick; I'll tell you later. As of now we're just printing a series, I mean a range of elements: zero to three. I'll tell you how to print individual ones, don't worry. You can do it, but it's very rare. Normally with a list it's very rare that you pick up individual elements, because a list you create may have something like one million elements, and it's very rare that I say I want only the third and the fifth. You mention a range of elements.

You can also do this: you can simply say, for example, one colon. What that does is, if I say one and just a colon, it picks up 30, that is the element at position 1, and gives you everything after it: 30, 40, 50, 60, 70. I can also say two colon if I want, and that will give me 40, 50, 60, 70. So you have these nice ways of accessing elements within the list, whichever way you want. We will look at lists more, don't worry.

There is also a simple example of a list here, so we have created a list. Now my question (don't run this, okay?) is: will this actually work? Here I am saying x equals a list with 1, then "b". So what is the specialty of this list? The elements are of different types. Does it actually work? No? Yes? I'm just asking. Yeah, it works; a list can contain different types of elements. But ideally we keep similar types of elements.

Now, apart from the list, insert a cell below. Another thing which is very interesting is a dictionary. I can say something like dict equal to... and you create a dictionary with curly braces. So whenever you see curly braces, that is a dictionary. And what is a dictionary? A dictionary is a collection of key-value pairs. For example, I can say Raghu: I'm a trainer, right? And then you guys: you are participants, right? I can also mix them. For example, I can have another key; for example there's a key
3, and I can say this is something like "my number". So basically, what is a dictionary? First point: you represent it using curly braces. And it always has a key and value structure. For example, here "Raghu" is the key and "trainer" is the value; again, "you" is the key and "participants" is the value; 3 is the key, "my number" is the value. Inside a dictionary there is no strict rule as to what should be the key and what should be the value; that's for you to define. You can have strings, you can have integers inside a dictionary.

One thing is that when you want to access the elements of a dictionary, it is different from a list. In a list you use zero, first, second; here you use the keys to access the elements. So what is the name of the dictionary? d-i-c-t, right? The dictionary name is dict, and I can simply say something like this: if I want to get the value of Raghu, I can simply call the key "Raghu" and I will get "trainer". Or any key, for that matter: I can say give me the value for 3 and I will get "my number". So basically this is used like a lookup table, sort of like a hash table, where you have n number of key-value pairs. You just call the key and you get the value.

You can also print things like this. For example, I can say dict.keys(). If I just ask for the keys, it's going to give me all the keys that are there. You should also be able to do dict.values(). Sorry, the brackets were missing. Yeah. So basically, if you get a dictionary and you just want to know what the keys or the values are, you can just call .keys() and .values(); it's going to give you the keys and the values, and then you can access each of the values using its key.

Individually you can go like this. But if you're interested in accessing a range of keys, like I want 100 keys, then ideally you should go for a for loop or something. Typically, if you want to iterate through a dictionary and get the different values, we use a for loop. I'll show you. Usually you say "for element in", and then one by one you get all the values, or something like that. So that's how you look at a dictionary.

Now, we are not really dealing with object-oriented programming here. Is anybody here coming from an OOP background, the object-oriented thing? The reason I'm talking about this is that usually on the data science or ML side we don't normally deal with object-oriented concepts. But Python is a language that supports object-oriented programming; it is an OOP language. Normally you have an object, and then you have methods and attributes of the object. So when I'm saying this, these are functions on the object, actually. I'm saying keys, then brackets, right? So basically you can make function calls and get whatever value you want. These are defined on the dictionary. We will see more of them. You will not be dealing directly with classes or objects a lot in Python here, but you may sometimes use some of the built-in methods like this.

Another interesting thing that you can do: let me just insert a cell below. I can have a dictionary where I have keys like this. Let's say I'm picking you guys. So you are a student, okay, and you were working in IBM, right? And then there is another student, s2, and this student was working in multiple companies. So let's say Infy... I don't know... this is the Silicon Valley of India, give me some company names. Infy, then what? Wipro. No, be more creative, give me something more. So what I did here is create a dictionary, and there is a key and a value; the value can be anything. In this example the value is a list. The value is a list, and I can simply ask for
students, and then I can say give me s1, and I should get IBM, right? But if I'm asking for student two, I should get Infy, Wipro, Google and all.

Now tell me one thing. These are in order. For example, the second student first joined Infosys, then moved to Wipro, and then, by fluke, Google. I don't know how you get from Wipro to Google, but yeah, it happened. So I want to know what the second company he was in is. When I say s2 I'm getting all of this. I don't want this; I just want to know Wipro. What will you do? Right, this is the way to access a list. I can just say that I want the key, which will return the value, and within the value I want the second element, which gives you Wipro. Or like this: if you want only Google, you can do it like this. So these kinds of operations are possible. The same thing I have written here; we will look at it later.

There is also one more thing, called a tuple. This is called a tuple, can you see? You define a tuple within brackets, the normal round brackets. So what is the major difference between a list and a tuple? A list is mutable, but a tuple is immutable. For example, this is a tuple. And let's say my_list equal to 1, 2, 3, 4, 5. In the IPython, sorry, in the Jupyter notebook, if you want to understand what operations you can do on a particular item, you can do one thing. For example, I have a list. Now I can say my_list, put a dot, and press Tab. It'll show you a list of the operations you can call. You can see there is something called append, so I can call append with, let's say, 7. So what happens here? See, if I print my_list, the 7 is there. Meaning lists are mutable: you can add elements, you can even remove elements. There is a method called pop and there is a method called remove.
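The list, dictionary and tuple operations walked through so far can be collected into one short sketch. Several details are assumptions: the session names its dictionary dict, which shadows Python's built-in, so the sketch uses my_dict instead; the key names "s1" and "s2" and the exact company spellings are guesses at what was typed on screen; and the list comprehension for picking non-adjacent elements is just one possible approach, not necessarily the trick the speaker has in mind.

```python
# Lists: square brackets, zero-based positions, colon for ranges.
my_list = [20, 30, 40, 50, 60, 70]
first = my_list[0]           # 20
fourth = my_list[3]          # 50 (positions 0, 1, 2, 3)
first_two = my_list[0:2]     # [20, 30]; the end position is excluded
from_second = my_list[1:]    # [30, 40, 50, 60, 70]

# One possible way to pick non-adjacent elements (an assumption).
picked = [my_list[i] for i in (0, 3)]   # [20, 50]

# Dictionaries: curly braces, values looked up by key.
my_dict = {"Raghu": "trainer", "you": "participants", 3: "my number"}
role = my_dict["Raghu"]      # "trainer"
number = my_dict[3]          # "my number"
for key, value in my_dict.items():      # iterate with a for loop
    print(key, "->", value)

# A dictionary whose values are lists (key names are hypothetical).
students = {"s1": ["IBM"], "s2": ["Infosys", "Wipro", "Google"]}
second_company = students["s2"][1]      # "Wipro"

# Lists are mutable: elements can be added and removed in place.
nums = [1, 2, 3, 4, 5]
nums.append(7)               # [1, 2, 3, 4, 5, 7]
popped = nums.pop()          # removes and returns the last element, 7
nums.remove(3)               # removes the first occurrence of the value 3

# Tuples use round brackets and have no append method.
my_tuple = (1, 2, 3, 4, 5)
try:
    my_tuple.append(7)
except AttributeError as err:
    print("Tuples cannot be modified:", err)
```

Note that pop works by position (the last element by default) while remove works by value, which is why the two calls above leave [1, 2, 4, 5].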
Using those you can remove elements, and you can add elements. But if I look at this tuple which I created, and I type its name and then a dot, I don't see a method called append. So tuples are totally immutable: once you've created one, you cannot modify it. If your requirement is like that, then probably keep your data in tuple format; that's up to you to decide. So that's the difference between a tuple and a list: lists are mutable, tuples are not, by the way.

Now one more interesting thing. I want to talk a little bit about the notebook itself. Here you have this New Notebook and all, which you should already know. And you can actually save this notebook: you can say Download As, Notebook. So if you have added some lines, you did something and you want to save it, you say Download As, Notebook file, and then you can upload it again back to Jupyter. You can also rename it.

One more thing: normally, if I type anything here, say a equal to 5, it will execute it. Now suppose I don't want it to execute; I want something like a heading. What I can do is just type some text, select this cell, and there is a cell type: Cell, Cell Type, and I can say Markdown. If I keep it as Markdown, it just prints it. And you always have this hash option. For example, I can put a hash and that will increase the font; it just makes it bigger. If you put two hashes, this will be the second heading, then the third heading; like that you can have a very nice presentation.

Always remember this, because sometimes I have seen people mistakenly make a cell Markdown. Then it will not execute the code; it's just going to print whatever you're typing. So you can go to Cell Type and see whether it is Code or Markdown. There is also Raw NBConvert; don't bother about it. But make sure it is Code if you actually want to run it,
right? If you go to this Help menu, you can actually see what is installed: NumPy, SciPy, Matplotlib, SymPy and pandas. SymPy is another library, which I will talk about later. Since you're seeing the help for these things, that means these libraries are already there. I think ideally you should also see similar things, and you can get help about them.

But let's say you want to install a library in future. Say there's a new library which you want to install from the notebook. The ideal way of doing it is this command: !pip install, and then you just mention what you want to install. If you put an exclamation mark at the beginning, the command will run from the command line. pip is a tool which we use for Python installations. I'm just saying that I want to install pandas. If I run this, well, pandas is already installed, so it's going to throw a message saying that the library already exists. You can see a star here, which means this cell is still running. When you see a star, that cell is getting executed. So it says "Requirement already satisfied"; pandas is already there.

In future, one way is pip install pandas; another way is conda install pandas. So either you say pip install or you say conda install. Since it is part of Anaconda, the installer is called conda, and conda install will install it. If there is a new library, you can just Google how to install, say, XYZ in Anaconda; the exact command will come up. Copy-paste it and it will install; that's easy. Any questions so far? Anyway, this one is already installed, so it's going to throw a message.

This star will be really useful when you actually build ML models and all. When you're building machine learning models, what happens is that it takes your data and then does something called iterations on the data. It iterates multiple times, so that might sometimes take a while. So you can actually see that your code is executing; you
need to wait for some time, right?

And on that, one of the problems that we're facing right now; I'm just sharing, it's probably not exactly a problem, but this is from my experience. We are always moving ahead with new technology. In the older days the problem was the hardware; you didn't know what to do about the hardware. That is when you started getting the cloud and all as an option. Now everything is available on the cloud; even your entire ML work can be automated in the cloud, apart from your local notebook. Initially, when machine learning became very popular, we were using the cloud for most of this, building ML models and all, and that was very easy. Now the challenge is that from ML we have gone to something called deep learning, the AI kind of side. When you go to deep learning, one of the problems you will face is the hardware.

For example, say you're building something on image classification. There's a project which I'm mentoring. You have these capstone projects, right? When you do a capstone project you will be in a group of, let's say, four or five people, and there'll be a mentor like me who will help you. Currently I'm mentoring a capstone project. What these guys are doing is CCTV camera footage identification. They have the images, and by looking at the image the system should actually describe the scene. It's not like it should say there is a dog or a cat; it should actually say what the content of the image is. So they're building something in that fashion.

Now, the challenge there is actually not building the model. The model and all are fine. But if you are doing things like deep learning, you need a lot of GPU, graphics processing unit. That is a challenge because GPUs are very costly. You know what a GPU is, right? The graphics card kind of thing. So if you're using a laptop or
something, it will not work, because you will have what, 2 GB or 3 GB max. One possibility we explored was that you can sign up with Google Cloud, and Google Cloud gives you some credits for free. If you sign up with Google they will give you around ₹21,000 of credit for free; you can use their services worth ₹21,000. But even that credit ran out, because these guys need around, I think, 16 GB of GPU or something. To train the model they took one or two days, and by that time the account was over. So the ₹21,000 was gone after three days. So then what do you do, right?

I think the next challenge that we're going to face is on the GPU side: how do you get it? Of course, if you pay money you will get it; that's not a big deal. But what is an economical way of building a deep learning model? That is the next challenge right now, and it is a very popular question because it is not easy. Locally you cannot get it; the only way is from the cloud, probably, but that is very costly. If you're running it by paying money, it will be really, really costly. So I think they're still trying to find out what to do. Initially they built a model as a test on the laptop, and that was working fine. But when you are really doing a project you need real data, not test data, and at that point it will take more resources. I don't know what to do either; I just advised them: you either pay, or you get some other cloud vendor or something.

Google is the only vendor who is, I think, giving this much money. If you sign up with Amazon or something, they won't give you that. AWS has a scheme called the free tier. That's basically cheating; you should file a case against them, actually. Amazon basically says, "We will give you every service for one year free." In fact nothing will be given. They give only minimal services; for the rest they will charge. GPU-based services and all, they
charge for; they don't give them free. Azure is also giving some level of discount. But so far, from what we explored, Google is giving a good amount of free money.

I'm just mentioning some of the challenges. Even when you do this kind of analytics, ML analytics, normally you can work on this kind of data on a laptop. If you have, let's say, a CSV file with 10 million or 20 million rows, on your laptop you can easily manipulate it using DataFrames. That's not a big deal. But if it goes beyond that, you may not be able to run it on a laptop; you need a better-configured machine. So a better option will be to use any cloud provider, just to test.

"If you're having ... ?" Yes, you can. One way is to take the same thing to a distributed setup. When you're talking about a distributed way, the small challenge will be that you need to learn a bit about it. For example, I can use something called Spark. Has anybody heard about Spark? Spark is a big data processing platform. I basically train people on Spark. Spark has the same ML capability; Spark supports your ML models. What Spark basically does is take a bunch of machines, say 10 or 20, and distribute your load. But you need to learn certain things about Spark. For example, I'm using NumPy here, right? NumPy doesn't work in Spark; you have a similar data type. Or we are using pandas; pandas as such doesn't work there either, so they have a similar thing called DataFrames. So there is some difference, and you might have a small learning curve. At the end of the day, building an ML model will all be the same. But in the exploratory analysis (EDA) part there will be some changes, because Spark has its own data types to deal with, not the regular ones you see here. That is one challenge we see sometimes: when you build a model and you want it in a distributed way, you need to slightly change the EDA part; the rest will be perfectly fine. Yeah. So some of
the people are actually exploring those distributed computing platforms as well. These are just some thoughts; it will probably be interesting if you go for that, though not immediately.

Now we will start working with pandas. Can you see this slide? There is a slide called "Read the data". Not a slide; there's a box, a cell, right? I want you to run this line. But don't run this NumPy thing; you can just comment it out right now. We don't need it. So basically you need to run import pandas as pd.

But we need the data, right? So what you need to do is go to the files that you downloaded. There is an Excel file called Uber Drives. Can you see it? Can you open the Uber Drives file? I will also open it from my side. Yes, so I just opened it. I'll quickly explain what this is. Basically this is an actual dataset which we have collected. It is Uber trip data. So how do you read this data? There is a start date and an end date. The start date is when the trip started and the end date is when it ended. In most cases they are the same date, same-day trips. Then there is a category, which is mostly business. Then there is the starting point of the trip for the Uber driver. These are all cities in the US, right? New York, Fort Pierce and all. Then there is the stopping place where the trip ended, and the number of miles covered. And then there is also the purpose of the trip. So this is what the data looks like. If you select any of the columns, you have roughly 1,000-plus lines of data. We are just interested in analyzing this data. We want to ask some interesting questions of this data, and we want to do that using pandas.

One thing that I would suggest you do is rename this file. Because it's called Uber Drives with a 1 at the end, just remove this 1 and call it Uber Drives 2016. I don't like it as it is. Actually you can keep it, but typing that will be
a problem, right? And then what you need to do is go to your notebook, and in the notebook you need to upload it. So go to the home folder of Jupyter, say Upload, and upload Uber Drives. Open it, and you'll see it will be here. Can you see? So just click on the Upload button, select Uber Drives, and click Upload. It should show up on the home page like this.

Python has a lot of ways to read different types of files. Right now we're interested in CSV. You can also read Excel files and other formats as well, but for the time being we'll concentrate on CSV files. If you want to read a CSV file, all you need to do is this. So what is this line, import pandas as pd? What do you mean by this "as pd"? It's an alias. The library that I'm importing is pandas, and I'm saying that I want to import pandas as pd. So "as" defines the alias, and pd will refer to pandas. If you want to read the CSV using pandas, you simply say pd.read_csv and give the name of the file. Then, if you actually want to see the file, you can simply type df. Oh, I didn't run this, right? Sorry, you have to run this first. I'll run this, then run this. It might take a moment to load. So df will be your file, I mean the variable into which you are reading it.

I don't know why it is taking this much time; it should be very fast. My PC is actually very slow for some reason. Let me do one thing: I'll just restart it. I'll just say import this... It is still a star for me. Are you guys able to read it? No? Yes?

So can you see this? The moment you say df, it's going to print it. Now the question is, what exactly is this thing called df? df is a DataFrame. The name can be anything; it could be Raghu. I'm saying the data type is called a DataFrame. What is a DataFrame? A DataFrame is the basic data structure we have inside pandas, which represents
your data in the form of rows and columns. So this pretty much looks like your Excel spreadsheet, and that data structure is called a DataFrame. In order to create a DataFrame, I'm just reading the CSV file using this method and assigning it to this variable called df. So df is my DataFrame. If you simply type the name of the DataFrame, it should print the output like this, so you can actually see what is inside.

Also, in Python, if you want to verify the data type of anything, you can do this: I think it should work, type of df. Yeah. So you can just say type(df) and it will say pandas.core.frame.DataFrame. So basically this is a DataFrame.

pandas is actually built on top of NumPy. It was built on the NumPy library. It does not directly use NumPy; the methods are different, but the core data structures are built on NumPy. "So why do we need to use pandas?" Actually, NumPy doesn't have any built-in functions to read Excel files. Also, in NumPy you don't have these labels. See, if I print this df, you have START_DATE, END_DATE, CATEGORY; those are the column headers. That structure is not available in NumPy, so we would not be able to read what is in the data anyway. In order to read it, I need pandas. In the DataFrame you can actually refer to a column by its name; that is possible. But in NumPy there is no way to do that; it doesn't support any labeling.

So now the question is, if you simply say read_csv, how is it able to read the file like this? For example, this is my column header. By default, the first row will be considered as your column header. Now what if I don't have a column header here? I get a CSV file and there is no column header in it. If you Google this pd.read_csv method, there are some arguments you can pass. You can say skip a line, skip the
header, add a header, and so on. One common problem that we have when you do things like this is about data types. We're always looking at data types, right? So one common problem that you're going to have is this. If I read this, this is fine; I have something called start date and end date. So what is this? These are dates, right? Normally you want to do things like subtract one date from another, or compare two dates. If I want to do something like that, they should be represented in a date data type, right? But the problem is that normally it reads this as a string. I will show you how to look into that. Normally when you simply say pd.read_csv, it's going to assume a string. Everything is a string unless it's a number; that will be an integer or a float. But these are all strings. So for this pd.read_csv method, if you read about it in the pandas documentation, there are some arguments you can pass. For example, I can say the third and fourth columns are dates, please treat them as a date data type, or skip the first line because that is not the header, et cetera. But as of now we're simply reading it, because the header is the first row here.

Right now, I think we will be able to see this if I do df.dtypes. You can always say df.dtypes. What does this mean? I want to know each column's data type. Now whenever you see object, that is nothing but a string. The class is actually called object; that is why it is showing us object. So the only column it is identifying as float64 is the miles column. That is okay with me. But my start date and end date are object, or string. I don't want them to be strings; I want them to be dates. All right, so there are ways to convert that.

No, no, this is an alias for pandas. Pandas is my library. I'm just saying that I don't want to type pandas.read_csv, so I say pd.read_csv. That is aliasing. I can also say
import pandas as raghu, and then I can say raghu.read_csv. Or you can simply say import pandas and then pandas.read_csv. But these are some of the common conventions that we follow. And when you're importing NumPy, you'd always say import numpy as np. That's a standard. So even if you read production code, you will see np dot, np dot, because they would have imported it as np, normally.

Now, another point. If I do this, I mean import pandas as pd, this brings in the whole pandas library. Maybe I don't want all the functions; I want only selected things. So you can say from pandas import only the selected things. For instance, I say from pandas import read_csv. If I do this, only read_csv will work by name; I can't call anything else. (Python still loads the whole library either way; the difference is only which names you can use directly.)

No, no, this is not a workbook sheet. This is a notebook. So these import statements I'm running here are executed behind the scenes, just like on a command line.

One more thing. What I did is that I uploaded this Uber drives CSV into the notebook, right? And then I read it. So right now this is in my directory, and the original file doesn't get changed. That is a good practice. I have a CSV file; if I read it directly from where it exists, do some manipulation, and save it back to the same place, then what will happen? My original file gets altered. Here, all these changes I'm making are not going to affect my original file. You could have downloaded it somewhere; this upload has its own workspace. And if I want, there is a method called to_csv. So if I have this data frame, I did some manipulation, and I want to save it back, I can say df.to_csv and write it back to a CSV file.

So this point is very important, because when you look at the data types, this one is object and this one is object. That means my
dates are actually strings. I don't want a string; I want a date data type. I will show you how to convert it. And it's a very common requirement, because we have examples where you're getting sensor data at certain intervals, and the sensor data will have a timestamp column for every time the device sends data. This will be in the Unix timestamp format, which nobody can understand. It'll have milliseconds, microseconds, nanoseconds, and date and time. So what will happen then? You read it with pd.read_csv or anything, and it'll say it's a string. All right, so then you have all string values, and you can't do any manipulation. So ideally you should change that into a date data type. I will show you how to do that, but that's a very common thing that we do.

And one thing that will confuse you very much is this. If I just print df, you see this, right? There is one thing which is confusing here. Can you tell me what is confusing? Nothing confusing? No? The point is, it has something called a row index: this 0, 1, 2, 3. This is very confusing. Normally in Excel you don't have this. You have it, but you don't really care. But in pandas it is very important, and this is one thing which confuses people a lot, because if I want to access the data, I can go column-wise or row-wise, and whenever I go row-wise I have to use these numbers. When I read the file, pandas by default put in these values, right? I didn't specify anything, so it keeps it like this. Another way is that I can define my own index. I can say I don't want 0, 1, 2, 3; I'll take whatever I want, keep it here, and then store it. That is also possible. I'll show you in the upcoming sessions, but this is something which confuses people.

And even though we call this a column, it is not a column. In our normal terms we're saying that there are rows and columns. This is not a column. This is called a Series, S-E-R-I-E-S. Series is a data type. So basically a data frame is
nothing but a collection of Series. This is one series, two series, three series, four series. That is why each Series has a data type. So in simple terms, you'll say that you read a CSV file and the CSV file has seven columns. That's what you say. But in reality, when you look at the data types, this is a data frame, and a data frame is made up of many different series. Each one is called a Series. You can actually see the data type of each series: we did dtypes, and on each column it was saying what? It's string, or object. So that's called a Series, actually.

In some of the classes I have seen people discussing how to create a series from scratch. I don't think that's really important, because when you start working in production it is very rare that you create your own data. You'll get it from somewhere, from a log, an Excel file, or something. I don't think you will create your own data by typing it in. But that is also possible; if I want, I can manually create a data frame. I can just fill in these columns and create a data frame. Here we read it, right? That's what we have done.

One more thing: when you simply say df, it's printing the whole data frame. If I have like one million lines, that's not really a good idea. So you can always say df.head(). This is going to give you the first five lines. That's easy, right? Because you just want to have a sample of how the data looks. And one interesting thing that you can do in your notebook: this head is a function that we're calling. If you want to understand what head is doing, you keep the cursor inside this head and press Shift+Tab. What does it say? It is saying df.head, n equal to 5. That means the number of rows is five. If I want more, I can say n equal to 10. So how did I get it? Just keep your cursor inside this and press Shift+Tab; it'll give you an explanation of the method that you're calling. In some cases, very complex
functions will not give you the entire detailed explanation; for that you have to look into the documentation. But sometimes it is handy. So always do a head, and always do a shape. What is the shape? It will tell you how many rows and how many columns there are. Here, 1156 rows and seven columns. This is really helpful because you will be doing a lot of filtering and manipulation, and then you want to know how your data looks, how many rows you have and how many columns. So always do a shape, very important, and always do a head in common data manipulation, right? So shape is another attribute you can use.

Yes, so we were able to get the data. Now one of the things, like I said, is that there's a problem with the dates that we have, right? So what we can do is convert this start date and end date. But before we do that, let's have a look at what we have already. We did head and then shape. You can also do tail, which I'm not really interested in. But one thing you can see from the tail is the last row. In the last row, this is actually a sum of all the values; this is miles, so they have added all the values. And what is this NaN? NaN is sort of like the official junk value in pandas. It's like null, but it is not null. NaN stands for "not a number". So whenever a data frame doesn't have anything to fill into a cell, it puts NaN. There is nothing called null, actually, in a data frame; instead of null you see NaN.

So the use cases will be: sometimes when you read the data there will be some junk or missing value which pandas is not able to understand. The junk can be anything; it may be a single character. Those values commonly end up as NaN. Now I can selectively say replace everything with something, say one, or zero, or even the mean, or delete it, something like that. So it's officially used to remove this
kind of junk values. Why are you seeing it here? Because normally a data frame has a fixed row-column structure; everything should be filled, ideally, right? So in the last row it is not able to add up all these things, and it just simply fills them with NaN values. Here you can see only this field has a value, right? You have twelve thousand something. Some value is there. So we'll look into that.

But before that I just want to show you one small example. So have a look here. This is something which may be interesting, because there is a method called pd.DataFrame. What this allows you to do is create a data frame from your own definition. Normally you will read a CSV; that's a different thing. Now I'm saying that I'm creating a data frame myself. And if I look at the data frame, oh sorry, I should run this first, it looks like this, right? So how have I created the data frame? What is this? This is a dictionary. And in the dictionary what do you have? Key-value pairs. A is a key, B is a key, C is a key. So if you pass a dictionary, the keys will become the column names and the values will become the column values. And the row index is, as always, 0, 1, 2, and so on. All right, so it created a data frame like this.

Now the problem is, if I look at the data types again, I have the same problem. A, I don't mind, it's a string: 1, 2, 3, 4. B is an integer; that's also fine with me. But C is a string for me, and C is actually a date, right? It has read it as a string. So what do I want to do? I just want to convert that into a date data type. There are multiple things you can do. The method is actually this: pd.to_datetime. And you can say something like a date in 2016. So there is a built-in method called to_datetime, and what you can do is just pass a string like this. So I'm just passing
2016, June 2, right? What did it do? It automatically identified that the year is this, the month is six, the date is this, and the time is 00:00:00. All right, this is very important. And what if you want to pass multiple things? You can put them in a list. So what does it do by default? The format is year, month, and day. But what if my data is not like this? There is a way to mention the format. In Python there is a datetime library that you can import, and once we import it, inside that you can specify the format.

I will quickly do one thing; I know that I'm a bit off from the discussion. Where is that? There is one more Python file. Give me one minute. Okay, this one. Sorry, guys, I'll just extract this. What I can do is go here and say Upload. There's a file I shared with you guys. Don't worry, the basics one I have shared with you already; I'm just finding it in the folder.

Yeah. So there's a built-in datetime module. If you're particular about the date format, what you need to do is import datetime. You can import datetime, date and time, and then you can do something called strptime, so I can mention that this is my data and the format is like year, month, date, hour, minute, second. This you can mention. So you can go around the default. If you look at this to_datetime, by default it assumes this kind of format, but I can also say it like this, right? And that should give me the date.

Now what if you have multiple dates? What will you do? I have four dates and I want to convert four dates. You always keep them in a list. So the advantage of a list is this: whenever you want to keep multiple things, we can keep them in a list. For example, I can say that this is in a list. Oops. Now I can say, give me
one more date, some date in 2020. Oops, it doesn't work. Now run it. The dtype is datetime; it converted, right? So if you're having multiple elements, you can pass them as a list and it should be able to convert them. If you have a single element, you can just pass that and convert it.

But that is not our problem, right? Our problem is that we have a data frame, which is called temp. So let's run this temp. Yeah. So in this temp I want to convert this C, right? How do you do that? I think I deleted it. So what is the column that you want to convert? C. When you're converting, this is what you need to do; this is how you access a column from a data frame. You say temp, which is my data frame, I want to access C, equal to, and then I want to do something with it: I want to_datetime of this column. So this will replace the C column. Now if I do temp.dtypes and see, C is now datetime. So for any column that you have, you can just do a to_datetime and convert it. And that is a common practice in many places: you have strings and you just want datetimes.

Note that it will change the original data frame. When you're doing these kinds of modifications, the original data frame temp will get changed. If you do not want that, you assign the result to another data frame and do it there. And that is another very confusing concept in pandas: there are some operations which will change the original data frame and some operations which will not. For example, if you're adding a column to a data frame, that will change the original data frame. So what happens if I just add a column? My data frame changes. But sometimes, when I'm removing a column, it is different: the original will not get changed. So I will say, okay, data frame, remove the last
column, but then if I type the data frame again, the column will still exist. So I need to assign the result to another variable and then use that. When we reach those respective examples I will show you properly. But in this example, yes, the original data frame is changed, so the data type gets changed.

Now another problem that might happen is this. Can you guys try this? Do a pd.to_datetime and convert. Let's say you have a list of values: a date, 2016 September 11, comma, ABC. Very much possible, right? I'm reading a column, or I'm manually passing the values, and I want to convert them into a date format. The first value is fine, no problem. The second value is wrong. So can you convert it? Obviously you cannot convert ABC to a date, right? So you can do two things. One thing is that you can filter them out and say, okay, I don't want them. Another thing is that you can add a parameter here. I forgot the name, actually. It is errors equal to coerce, C-O-E-R-C-E. So you can just add a small argument saying errors, coerce. What will happen is that it will understand that you want to convert the data: wherever it can convert, it'll convert; wherever it cannot convert, it'll say NaT. It means "not a time", again like NaN. So that will be useful, right? Otherwise, even if you have a single bad value, it will throw an error and say that it cannot convert. Sometimes it is useful because you read timestamps and it is not able to understand all of them.

Sometimes this will not come off the top of your head. In the last class somebody asked, how will you do it? I didn't remember it off the top of my head then, but I found it later. Normally bad values will be filtered before you get the data, but you can do it anyway. So you can add this line in
case you want to convert it. You can also say pd.to_numeric. Okay, so look at what we're doing here. You have this A, right? What is A? A is a string. I can say pd.to_numeric of A. So what will happen? It will become an integer, right? So you can convert from string to integer, and you can convert from string to date format. That's possible.

And there are some more things. I think I have it in here. You can mention the format, month, date, and so on, if you want. There is a URL I have given here, for the Python 3 library documentation on datetime. You can just open this to understand more about the datetime functionalities that you want to use. Also, when you're actually doing the ML classes, this will be repeated again. So don't think that what you learn right now is the end of learning. If you're having data where you have dates, you will definitely deal with this, right? So this is just like an introduction.

There is a very useful method called describe: df.describe(). What this does, probably you can understand: it will describe the data frame. For example, there is start date, end date, category, start, stop. How many are there, the count of them; how many unique values there are; what the top value is; the frequency; mean; standard deviation; minimum; the 25 percent, 50 percent, and 75 percent values; and the max. So this will be useful if you want to get a look at your data set and see what the common column values are. Wherever it is not applicable, it will say NaN, right? So for some of them there is no mean, so the mean is NaN; in other places it is actually going to display them. So the describe method is very useful if you want to get an idea about your data. You can say include all, or you can include only the data types that you want. There is also info. We will look at info later.

And another thing that people usually want to do. So right now we have
this Uber data, right? In the Uber data we have a column called start. Start is the starting point of all the Uber trips, right? So I want to know how many values there are. You can say value_counts. So these are the locations: there are 201 trips starting from Cary, 148 unknown, and 85 from Morrisville. I don't know, there's an Islamabad; I thought there might be an Islamabad in New York or somewhere, but further down there is also Karachi, I think. I don't know why, but yeah. So this is very useful, because you can know how many unique values there are and the count of each value. So you can always do a value_counts and then go ahead. Also, if you want, you can add a head, so it shows just the top five locations, the top five starting points, something like that. A very interesting thing.

Okay, so let's look at the common data manipulation tasks that you can do. When you get data in the form of a data frame, these are some of the things that everybody wants. One is selecting or indexing the data: I want to select only one column, three columns, two columns. Very common. Second is filtering the data, always: I want all the values greater than this, less than this, something like that. Then sorting the data, for sure. Then mutating, or conditionally adding a column: I have all the sales revenue and I want to add a new column with the total or the mean of this, something like that. Then group by and summarize: for example, in the Uber trip data, I want to group by the start location and find all the trips which are more than 10 miles, or something like that. This kind of analysis. So if you can get a basic idea about these five tasks, the majority of your, what do you say, EDA is done, right? If you can do this much, that's all you want to do. And of course another thing that might come in here is functions: for some of these tasks you might want to write a function to perform them. We will discuss that. But commonly, if you can understand these aspects, that's more than enough.
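Going back to the value_counts call on the start column described above, here is a minimal sketch of it. The tiny data frame and its values are made up for illustration; they only imitate the shape of the Uber drives file, and the column name `start` is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the Uber drives data; the values are invented.
df = pd.DataFrame({
    "start": ["Cary", "Cary", "Unknown Location",
              "Morrisville", "Cary", "Unknown Location"],
})

# One entry per unique start location, most frequent first.
counts = df["start"].value_counts()

# Chain head() to keep only the most common locations.
top_two = counts.head(2)

print(counts)
print(top_two)
```

On the real file the same two calls would give the per-location trip counts and the top few starting points.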
When it comes to selecting, which is the first thing we're going to do, there is a small thing you need to understand: there are two indexers. One is called iloc, and the second is just called loc. So whenever you want to select columns or rows from a data frame, you can either say iloc or say loc. You may see warnings about a deprecated indexer in some pandas versions; I don't know what version you're running. The indexer that was deprecated, and removed in later versions, is ix, not iloc. iloc is still supported, but loc is what almost everybody uses. And the difference is very simple: iloc means you are accessing a row or column using a number, by position; loc means by the name. I'll show you a practical example.

So let's first look at the data frame. If I want to get some data, I can say iloc, and then I can say, let's say, rows 0 to 5, just an example, and then I can say 0 to 4. So can you tell me what happened here? When I say iloc, the first part is the rows and the second part is the columns. So 0 to 5 means row numbers starting from 0: 0, 1, 2, 3, 4. It will not include 5. And 0 to 4 means which columns? 0, 1, 2, 3. So basically you want these five rows and these four columns. The first thing that you mention is the index, that is, the rows. The second thing that you mention is the columns. That is how iloc works. From the data frame you're slicing a part of the data, so that is indexing, right?

You can also do one more thing. If you want selective columns, you can go like this: I can say I want only the zeroth column and the third column, and write them like this. Rows 0 to 5, then column zero and column three. Column zero is the first one; column three is this one, right? Can you guys try this? Let me know if you're able to do it. Not able to do it? Okay, let me ask you one question. What will happen if I do only this? Will it work?
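To make the positional slicing concrete, here is a minimal sketch of the two iloc calls just described. The data frame is a hand-made stand-in, so its column names and values are assumptions rather than the contents of the real file.

```python
import pandas as pd

# Hypothetical stand-in for the Uber drives data; names and values are invented.
df = pd.DataFrame({
    "start_date": ["2016-01-01", "2016-01-02", "2016-01-02",
                   "2016-01-05", "2016-01-06", "2016-01-06"],
    "category": ["Business"] * 6,
    "start": ["Cary", "Cary", "Morrisville", "Cary", "Durham", "Cary"],
    "stop": ["Morrisville", "Cary", "Cary", "Durham", "Cary", "Raleigh"],
    "miles": [5.1, 5.0, 4.8, 4.7, 63.7, 4.3],
})

# Rows at positions 0..4 and columns at positions 0..3.
# The end of each slice is excluded, as with ordinary Python slicing.
block = df.iloc[0:5, 0:4]

# Same rows, but only the columns at positions 0 and 3, given as a list.
picked = df.iloc[0:5, [0, 3]]

print(block.shape)
print(list(picked.columns))
```

Because iloc works purely by position, the same code keeps working if the columns are renamed, but it silently selects different data if their order changes.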
Just a question. It works, and what happens? The first argument you're passing is the rows, right, and you're getting all the columns. But if you want the other way around, you have to say colon, comma, 0 to 4. What does that mean? This first colon means all the rows. I want all the rows. Look here: 1, 2, 3, 4, all the rows. It just shows et cetera and doesn't display everything. I want all the rows, but only these columns. So remember this: if you simply type a single range, what will happen? Only those rows, and all the columns, will be displayed. But if you want all the rows and just the columns that you are interested in, you can say this. Selecting certain columns that you want, or rows that you want: that is what we call selecting, or indexing. You use the index position that you're mentioning, zero or one or whatever.

Now, you want all the rows, and colon minus one for the columns. What does this mean? All the rows, and all the columns except the last column. Do you have the last column? Where is the last column? You don't have it. It is not there. It was the purpose column, I think.

One more thing to understand, which is why I do this: raghu equal to this, okay, and then df.shape and raghu.shape. So I said, from the data frame, select all the rows and all the columns except the last. Fine. But that is not going to affect my original data frame. My original data frame still has 1156 rows and seven columns. I'm saving the result as another data frame called raghu, and when I look at the shape of raghu, it has only six columns. So whenever you're selecting, the result has to be saved somewhere; the original data frame never gets changed. It's like a projection, so you need to save it. In SQL, what do you do? You say select this column, this column, this column from something, right? And you want to process it. Of course the original table will not change; you save the result as another table or something. Same thing. So this is the importance of shape. You
can see what is happening at the row level and the column level, right? Any questions so far? But yeah, I don't think iloc is the one you will use most. Why do you think so? Exactly: the column names. If you have column names, who is going to remember the sixth, tenth, seventh position? That's a pain, right? So that is why loc came, and mostly we prefer loc. loc is easy, actually. It is label-based indexing. So I can simply say something like this: df.loc, and I can say colon, comma, and then the column names, start and stop. So what is happening? I pass them as a list, right? I don't want only one column; I want the starting and stopping locations. What is it called? Start, stop. I don't remember exactly. What will happen if I remove this colon? It doesn't work, right? Because you are giving rows and then columns. So you say this and this. Just give me one moment. One moment. Yeah, now it works here.

Since this label-based indexing is very common, people are using loc a lot these days. The other thing is that normally, when you are working with a CSV file or an Excel file, you're mostly interested in the columns, not in the rows. The rows are there, but you want to calculate the average for a column, or summarize a column. So the pandas developers have become more liberal, and now they're saying that you don't have to type it the hard way, like loc, then colon, and all that. If you simply say df and these columns, you'll get it. Can you see what I typed? Earlier you had to type df.loc, then colon, comma, for all rows. But now I just want to see two columns. This is easy. So this is sort of like a shortcut, and in actual programs you will see it like this. They won't write loc and all, because it is assumed that you want to look at a column, unless you want to filter on the rows, right?
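The label-based selection and its square-bracket shortcut can be sketched like this. It is a minimal illustration on an invented three-row data frame; the column names `start`, `stop`, and `miles` are assumptions.

```python
import pandas as pd

# Hypothetical mini data frame; names and values are invented.
df = pd.DataFrame({
    "start": ["Cary", "Cary", "Morrisville"],
    "stop": ["Morrisville", "Cary", "Cary"],
    "miles": [5.1, 5.0, 4.8],
})

# Label-based selection: all rows (the colon), two named columns.
by_loc = df.loc[:, ["start", "stop"]]

# The shortcut: a list of column names inside plain square brackets.
by_brackets = df[["start", "stop"]]

print(by_loc.equals(by_brackets))
print(by_loc.shape, df.shape)
```

Both forms return a new data frame with the two columns; `df` itself still has all three.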
Now, I don't want to filter on the rows; I just want to see the columns. I have an interesting question. Okay, not a question; I want to show you something. Let's go back to the original way of doing it. How do you do it? df.loc, all right, then you give the rows and the columns, and you get the same thing, right? Now pay close attention, I'm going to do something. df.loc, colon, comma, and let's say only start. Okay, start is there. Pay close attention. This is the output of two columns, and this is the output of one column. Do you find anything interesting? The header? Yes, the header is gone. And this doesn't look nice, right? If I comment this out and run the other one, that looks very nice in its representation; this one doesn't. The reason is that you're giving only one column. If you're giving only one column, remember, that is a Series. You remember I told you a data frame is made up of individual columns? So it is printing it as a Series. How do you know? You can assign it. I can say abc equals this, and if I check the type of abc, what do you see? Series. It's a Series.

All right, understood. But probably in your data manipulation tasks you don't want the Series. I want only one column, but give it to me as a data frame; I'm not bothered about learning Series and all. Then there is only one small thing you need to do. I think it should work. How many brackets? Two. I'm just passing it as a list. So what will happen? It will not be treated as a Series; it'll be returned as a data frame. These are small things, but they may become very important, because you're working on very large data and you're doing some selection on a single column, right? You selected one column, but the output is a Series. Then if you write something on a
Series, it may not work, because not all the functions that work on a data frame will work on a Series. Then you're wondering why what you just wrote is not working. So you might want to pass the column inside a list, so that it will be returned as a data frame. Now this abc will become a data frame, a proper data frame. Try this out if you want. If you need time, tell me; I just keep going to the next thing. I don't think this is in the notebook I shared with you. Some of this I make up as I go, so it may not all be in the file that I'm sharing with you.

No, this one is a data frame; the previous one was a Series. Previously I was passing just the start column. Now I added this bracket. If you pass it as a list, the result will be a data frame.

So what I'm saying is, normally, if you want to select only one column, let me comment this out, you can just pass the column, right? So what am I doing here? I'm saying df.loc, I want all the rows, and the column name is start. No problem, right? But if I run this, it will create a Series. And the way to identify whether it is a Series or not is by doing a type and checking the data type. A Series is nothing but a single column; single-column data is called a Series. But I can convert that into a data frame. Even though it is a single column, I can have a single-column data frame. Imagine an Excel file with only one column. Nobody's saying that you cannot have only one column in an Excel file. Probably I want it like that. Then I need to pass this inside a list. So look at what I'm adding here. What am I doing? I'm adding this bracket. See, I'm passing this as a list, right? And now if I run this, it becomes a data
frame. So you need to do this if you need the returned data in the form of a data frame. Normally you will use the square brackets if you have multiple columns; even for a single column you should use them, ideally. Only then will the output be a data frame. Otherwise it is a Series.

Yeah, loc is, sorry, iloc is the number you are mentioning; loc is the name. So here I am saying loc and start. If this was iloc, I would say the third column; I have to mention only the number. And iloc is not very common these days, because you have a CSV file and people don't want to go by the number, right? You have the label. Label-based indexing is loc, and that is what is common these days; we mostly use loc.

So why don't we do an in-class assignment, right? I have some interesting assignments. The thing is, the ones which are easy to demonstrate, I do; the ones which are difficult, I give to you as an assignment, right? You're not smiling.

Yeah, so this is a list, and that one is a dictionary. So yeah, it will accept it. loc is an indexer which takes arguments: where are your rows, and where are your columns. So I am passing a list of elements, not a dictionary. A dictionary is key-value pairs. Here I don't have a key and a value, right? I just have column names. If I'm passing only one column, I can just give the one column; I don't have to use a bracket. And if I'm passing multiple columns in round brackets, as a tuple? I don't think it is supported, because if you look at the df.loc documentation, it says the input should be a list. I don't think it's supported, but let's try, since you have asked. You're saying a tuple, right? So what do you do? You pass it like this. It cannot be a dictionary, I know, because if I'm passing a dictionary it has to have a key and a value. But like this. And this is the
difference, and that's what I'm saying: I passed it as a tuple, but the data type is a Series now. It's not a DataFrame. Ideally you pass it as a list. If you have only one parameter, you pass it as it is, as a string. If you have more than one, you pass a list. No, but it is treating it as a Series; that's the problem. That's what I'm saying: you can do it, but you will not get a DataFrame. It is treating it just like a string. Passing it as a tuple is not adding any value, because even if I remove the tuple I get the same output. But what if I pass two elements? Let's try that. What if I'm passing two elements here: START*, comma, STOP*. Now it is giving you a DataFrame, because there are two columns. But if I'm passing a single column as a tuple, it will not give me a DataFrame. Does the tuple being immutable mean the result cannot be changed? No, no. It means this tuple itself cannot be further modified. But I can add a column to the DataFrame again; that is possible. These are the original columns I have in the DataFrame. Yes, because that is the property of the DataFrame, not of what I'm passing. A DataFrame by default allows me to add columns and remove columns. If I want, I can say whether the original DataFrame should not be changed or should be changed; both options are there. This is also new to me; I have never seen anybody using a tuple here. To tell you the real answer, I can probably check why a tuple should not be acceptable, but I have never seen it. If you're passing more than one value, ideally it's a list of values, not a tuple of values. A list is the mutable collection which allows multiple elements.
So, loc. It is not a function, actually, so the round brackets will not come. You're asking whether you are calling a method, right? This is not a method. loc is a selection attribute. So when I'm calling loc I just need to mention the index. It takes only two positions: one is the row, one is the column. It's not a function. If loc were a function or a method, I would use round brackets and could send options; that is not there. Whenever you see round brackets, it is a method; this is not a method. So why don't you do this in-class assignment? Let's see how many of you can do it. Can you see this assignment? Extract the first five rows and two columns. I'll just explain: I need a DataFrame where I have five rows and only these two columns. That's the assignment. You can use loc or iloc; that's up to you. Even for me, some of these things are new. See, I wrote the solution like this, and this is correct. Now I got another solution, where you put a colon for the rows and then do a .head(). It is up to you to decide, but normally that is not something we do. head() will display the top five elements. head() is normally like a LIMIT statement in SQL. What is the purpose of a LIMIT? You just want to see the data. Where will you use a LIMIT? You have one million rows and you just want to see the top ten rows, so you use LIMIT. So I can do a .head() and get the output, but usually when somebody asks you for a row selection, what they are asking is: I want specific rows and columns. I can write .head(), I know, but ideally this is how you should write it. Somebody was also asking whether there is a performance difference. I didn't see any performance difference, whether you write head() or do it like this. But ideally the selection should be written with loc or iloc.
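The assignment above (first five rows, two columns) can be sketched in the three ways mentioned: loc, iloc, and a column selection followed by head(). The toy data and the column names are assumptions for illustration only.

```python
import pandas as pd

# Toy data with a default 0..7 index; names and values are invented.
df = pd.DataFrame({
    "START*": ["Cary", "Cary", "Apex", "Durham", "Cary", "Raleigh", "Apex", "Cary"],
    "STOP*": ["Apex", "Durham", "Cary", "Cary", "Raleigh", "Cary", "Cary", "Apex"],
    "MILES*": [5.1, 4.8, 7.1, 10.4, 8.3, 6.2, 1.9, 3.3],
})

# Label-based: the slice 0:4 includes label 4, so this is five rows.
by_label = df.loc[0:4, ["START*", "STOP*"]]

# Position-based: the slice 0:5 excludes position 5, so this is also five rows.
by_position = df.iloc[0:5, [0, 1]]

# All rows of the two columns, then the top five.
via_head = df.loc[:, ["START*", "STOP*"]].head()

print(by_label)
```

On a default integer index all three give the same frame; the thing to watch is that loc slices include the end label while iloc slices exclude the end position.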
Anyway, you are the one to decide; if you are interested in doing a .head(), that's also fine. I have a question: can this index be changed, so that it starts from a different number? No, not this one. We cannot change this as it is; this is a system-generated index. This zero, one, two and so on is system generated. If I want to apply any condition, I have to apply it on the start date column. I can say, within the start date, I want the rows from this day to that day. I'll show you how to do that, but this index will not change. What I can also do is replace this index. Right now it is a system-generated index, zero, one, two and so on. If I don't want that, I can say replace this index and have an index of either a date or any column. I can use a column as an index, I assure you, but you need to have a column that you can take as an index. We will take an example. No, no, that is different. That is SQL indexing; this has nothing to do with that. This index is just the position. The SQL indexing you are talking about is totally different, because a database index builds and stores an extra lookup structure over the column values, and that is how it reads faster. Here it is just the mentioning of the position, nothing more. With iloc, one difference is that you cannot pass the start date and miles column names; you have to give the column number. What would that be? You want to know what the number of the start date column is, right? You have to figure it out based only on the position, and that is why iloc indexing is not very popular. Because if I give you a DataFrame and ask you to do an
iloc on this column, you first have to find its position. But you can say .columns and get it. I mean, you don't have to look at the whole DataFrame. What's the name of the DataFrame? So I can do df.columns and see that these are the columns, and count zero, one, and so on. Otherwise I would have to read the entire file to look into that. There is no other way to do it, actually. Okay, so we just did selecting, and I want to bring your attention to filtering. What do you mean by filtering? You can filter the data. Now there is one small thing you need to understand. Filtering works very easily. For example, one of the conditions that I want to put is that I want to filter all the trips which are greater than 10 miles. So there is a MILES* column and I need to filter on it. But before you run this, don't directly run this. Let us see what will happen if I run only the condition. So I'm running only one part of the code. It says: in the DataFrame I have a MILES* column, and I want everything which is greater than ten. Now don't run this yet. Can you take a guess what will happen if I run this? The condition I'm giving is that I want to look at the MILES* column in the DataFrame, where I want all the miles greater than ten. If I run this, you get True or False for every row, because this is how filtering works. First you have to identify the condition you want to put: I want this column greater than this value. That's a condition. If you only run the condition, what will happen? For each row the condition will be either satisfied or not satisfied, so basically you will get True or False. But I don't want this, right? I want all the trips where the distance is more than 10 miles. That is where you need to apply the condition like this. So instead of this, you will say I want to create another DataFrame where I'm using loc.
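To make the "run only the condition" step concrete, here is a minimal sketch on invented data; the MILES* column name is an assumption based on the session. The condition on its own produces one True or False per row, and using it as a filter keeps only the True rows.

```python
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({
    "START*": ["Cary", "Morrisville", "New York", "Cary"],
    "MILES*": [8.4, 12.9, 3.1, 15.0],
})

# The condition alone: a boolean Series, one value per row.
mask = df["MILES*"] > 10
print(mask)

# The condition used as a filter: only rows where the mask is True remain.
long_trips = df[mask]
print(long_trips)
```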
I'm saying that I want all the rows with miles greater than 10, comma, MILES*. Another thing: why am I putting MILES* at the end? Because then it will display only that column. If you simply add the filter condition, it is not going to restrict the columns; with loc it displays only the columns you mention. Do you want one more column? You add it. I also want the start date: you add it, and you get the start date for those trips. In the same way you can add START*, STOP* and whatever else you need. Okay, you did it without the loc. What happened? Yes, so that's correct, but you get all the columns, right? You're simply saying that in the df you just want to apply the filter like that; that is absolutely possible. You're just filtering. But what will happen to your original DataFrame, do you think? The condition that you're adding is filtering the rows based on that. So this is one way to do it, but here you are saying that I want all the columns. The only difference is, as I told you, it is not mandatory that you use loc; you can remove loc. So I can simply say df of this condition, and then I can say I want only specific columns; that is also possible. Okay, so what happened there? You cannot pass it like that. You cannot put this True/False condition and then pass two columns inside the same brackets; that doesn't work, and it throws an error. So whenever you're doing filtering, you need to do one of two things: either you apply the filter and keep all the columns, or, if you're picking certain columns, you use loc. Or you remove loc, apply the filter, and then select the columns from the result.
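The options discussed above can be sketched as follows, again on invented data with assumed column names. The last part shows the form that does not work; the exact exception type and message vary between pandas versions, so the sketch only records that an exception is raised.

```python
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({
    "START*": ["Cary", "Morrisville", "New York", "Cary"],
    "STOP*": ["Apex", "Cary", "Queens", "Durham"],
    "MILES*": [8.4, 12.9, 3.1, 15.0],
})

# Option 1: loc takes the row condition and the column list together.
picked = df.loc[df["MILES*"] > 10, ["START*", "MILES*"]]

# Option 2: filter first (all columns), then select columns from the result.
two_step = df[df["MILES*"] > 10][["START*", "MILES*"]]

# Not supported: condition and column list in plain square brackets.
raised = False
try:
    df[df["MILES*"] > 10, ["START*", "MILES*"]]
except Exception:
    raised = True

print(picked)
print("plain brackets with a column list failed:", raised)
```

Option 1 is usually the tidier choice because the rows and columns are chosen in one step.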
What he was asking earlier was why it was throwing an error. After the condition he was adding two more columns. That is not possible. If you are writing it like this, you are applying a condition; this is a True/False condition, so it applies to the whole DataFrame. Now, from this result, I can say I want to select only certain columns. I have a df2 now, and from df2 I can say I want only two columns. But in this one statement I cannot add a comma and say I want only two columns; written like this, the filter is applied to the whole frame. The error message is about a Series object being mutable and not hashable: it is treating the condition as a Series. So what was the condition we added? You can have an "and" and an "or"; I'll show you examples, and I will come back to that question in a moment. But basically, what were we doing? He was using df.loc, and the condition was df, MILES* greater than 10. So this is a simple filtering that you can do. Somebody was asking whether you can directly apply that. So here I am doing df, MILES* greater than 10, and then .head(). Basically, if you want all the columns, you don't have to select specific columns; you will get them all. One more thing which is interesting: what if you want to find all the rides to or from Cary? Not Cary, let's say from New York. In that case what will you do? You will say the column name equals-equals the value, New York. So here you have a string column; START* is a string column, and you can do an equals on it. And one of the things that we commonly see here is this isin operator. This is
very common. What is this isin operator? You want to match multiple values. For example, I want to take the start location and find all the rides starting from Cary or Morrisville. So if you want to look for multiple locations, you can do this. Or let's say you want Cary and New York: if the start is either Cary or New York, it is going to give you the row. Let's save it as something; let's call it my_rides. And if I display my_rides, here you have the rows matching Cary and New York, in the sense that if the start is either Cary or New York, it is going to pick it. So this isin operator is important. I don't want to get into the discussion now, but there is something called a regular expression; you know that, right? You can always write a regex. For string operations, whenever you need to exactly match, or select values starting with this letter or ending with this letter, you write a regex. The index numbers don't seem to be updated? Oh yes. I wanted to clarify that; I think somebody asked during the break. This is very important. Let me show you something. What did you do? You applied a filter, and you have my_rides. If I do my_rides.iloc and ask for the first five rows, this will work, right? But if I do the same thing with loc, I will get nothing. Why? This is very, very important. You are doing a filter; after the filter condition you're picking all the rides from Cary and New York. The first few rows are not there; the index is starting from seven, probably, something like 7, 8, 10, 22. When you say loc, it is name-based indexing: column names and row names, not row position. So if it finds no row with that label, nothing will come; it goes by the 7, 8, 10 labels. That is a confusing thing, because if you're using loc, it is the name of the row, not the position.
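A small sketch of isin() and of the loc/iloc surprise described above. The data and the my_rides name are illustrative assumptions; the point is only that a filtered frame keeps its original row labels.

```python
import pandas as pd

# Toy data; the first two rows deliberately do not match the filter.
df = pd.DataFrame({
    "START*": ["Durham", "Raleigh", "Cary", "New York", "Apex", "Cary"],
    "MILES*": [4.0, 6.5, 8.4, 3.1, 2.2, 15.0],
})

# Keep rows whose start is any of the listed values.
my_rides = df[df["START*"].isin(["Cary", "New York"])]
print(my_rides.index.tolist())  # the original labels survive the filter

# iloc goes by position, so the first two rows of the result come back.
first_two_by_position = my_rides.iloc[0:2]

# loc goes by label; labels 0 and 1 were filtered out, so this is empty.
by_label = my_rides.loc[0:1]

print(len(first_two_by_position), len(by_label))
```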
So normally in these cases what we do is a technique called reset index. I stored this as my_rides. Right now the problem is that in my_rides the index starts at 7, 8. There is an option called reset_index. If I reset the index, this will start from 0, 1, 2, 3. There are some more things to it, and I will show you how to do that, but that is one thing we commonly do. Shouldn't loc on df still work? Yes, on df it should work, because df is my original DataFrame. If I display df, it will have all the original labels; that is never going to change. But I'm saying that in my_rides the index is different, because it is holding only those filtered rows. You said you want to use the column name directly; you can do that also if you want. This is the easy way of doing it. Ideally you should say .loc and then select the column, but pandas also allows you to access the column directly; that is also possible, and there is no real difference. As I understand the history, the older indexing mixed labels and positions, which was a problem, because working with column numbers is very difficult. So pandas introduced loc for labels and iloc for positions, to differentiate them. But if you are comfortable, you can just give the column name and it will work, because you are mentioning the column name. If it is iloc, it has to be the numbers. So I will just remove this. How do you clear this? I'm not able to clear the output. Okay, so why don't you do something: I'll give you an in-class assignment. Just like the other assignments: find all the trips with distance
greater than 10 miles and originating from Cary and Morrisville. So what is the condition? It's a filter. You want to find all the trips with distance more than 10 miles and originating from Cary and Morrisville. What is the challenge that you're going to have? You are applying two filter conditions. How do you apply two filter conditions? You put an "and" between the two filter conditions, and when you put an "and", what you should do is put each condition in a bracket: bracket, "and", bracket. Just give it a try. I'll also try it and see if it is working. There is documentation on when to use what, but this is very common, even in other programming languages: when you have an "and" or an "or", you put the conditions in brackets. In SQL you do the same. Here the "and" is the ampersand character and the "or" is the pipe character. This may take some time, so I'll give you a rough idea of the answer. You will have a loc for sure. First you will have a square bracket, and inside the square bracket you will have a round bracket. There you will write the first filter, the miles filter. Then you will say "and", and in the second bracket you will have the isin with Cary and New York, or whatever. Even I have to try the answer and see if it is coming, so give me one moment. Oh, so people actually got the answer. That's good. That means my question is getting through; I thought I was the only one who would get the answer. The ampersand is for conditions only in filtering. The normal Python "and" and "or" are for ordinary conditions; for writing filter conditions in pandas you use the ampersand and the pipe.
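The assignment above can be sketched like this, on invented data with assumed column names. Each condition sits in its own round brackets, and the conditions are joined with & (and) or | (or) rather than the Python keywords.

```python
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({
    "START*": ["Cary", "Morrisville", "New York", "Cary", "Durham"],
    "MILES*": [8.4, 12.9, 14.0, 15.2, 30.0],
})

# More than 10 miles AND starting from Cary or Morrisville.
mask = (df["MILES*"] > 10) & (df["START*"].isin(["Cary", "Morrisville"]))
result = df.loc[mask]

# More than 10 miles OR starting from New York.
either = df.loc[(df["MILES*"] > 10) | (df["START*"] == "New York")]

print(result)
print(either)
```

The round brackets matter because & and | bind more tightly than comparison operators such as >, so leaving them out changes what is being compared.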
One more small thing I want to tell you: this bracket and this bracket are different. Do you understand? One is a method you're calling; the other is just grouping the two conditions for the "and". Let me add one more bracket; maybe then it is easier to understand. Oh, it is working. But this is for readability as well: you're saying that this is one condition and this is a different one. Somebody was asking a very interesting question, and I wanted to answer it. Can you have a look at what I'm going to type? Why am I getting an error? What is the spelling? Thank you. Now I have a very interesting thing I want to show you. I created something called df3. A lot of you were asking: there is a problem, because if I look at df3, the index is starting at 28. So you can do a reset: you are resetting the index of the DataFrame. But there are a couple of problems. What are the problems you identify? When you reset the index, it adds a new index, but the old index remains as a column. Also, one more thing, very interesting: look at df3. No change; it didn't reset the index. So a new index column is added, but there are two problems. One, the existing index is still there as a column. Second, my original DataFrame doesn't change. Let us solve the problems one by one. The first problem I'm going to solve is that I want the change to happen in the original DataFrame. This argument is very important: inplace equals True. What it means is: change the original DataFrame. So I'm saying: reset the index on
df3, and I want the result to stay in df3 only. If I display df3 now, df3 has changed; the original DataFrame has changed. By default it is inplace equals False, meaning it will not change the original. Now what is your problem? It wasn't changed, and I want to change it, because this is my filter output, and in the filter output I want the index to start from zero. I don't want the old index. That is why I'm saying inplace equals True to reset it. Guys, pay attention, because this is a bit confusing. If I simply say reset_index, what will happen? It will reset the index, but the original DataFrame will be unaffected. What can I do? I can save the result as another DataFrame. Are you able to understand what I'm saying? But I don't want that, because I wrote a filter, and the output of the filter condition is df3. I want df3 to be my final answer; I don't want to create another df. So what I can do is say inplace equals True right here. I can say: df3, reset the index, and inplace equals True means the index of my original DataFrame is reset. But again, my problem is that the old 28, 34 remains as a column. So I can say drop equals True. That is your final answer: it will drop the previous index. This indexing can be really confusing, so I will draw it; that's easier than talking about it. What did I do? I had a DataFrame. Let's say the index is defined like this, and here you have a column called name and a column called age. Here you have some names, A, B, C, D, E, F, and ages 1, 2, 3, 4, 5, 6. I wrote a filter where I said I want age greater than four. This DataFrame is called df. What is the index of df? Zero to five.
I wrote a filter on age: I want to apply the filter age greater than four on df. The result of this filter will be created as another DataFrame, df1, because I'm saving it as a separate thing. If I display df1, what will happen? How many rows will be there? Two, and the index starts from four. Is this clear so far? Now forget the original, because we are done filtering. This is my data now. My problem is that I want to access rows and columns and everything, but my index is starting from four. I don't want this; I want to reset it. That is where you say df1.reset_index(). The problem is that if I run this command, it gives back a frame with the index reset, but the original DataFrame, df1, will not be affected. Then what is the use? I want df1 to be my final output, but if I only call reset_index it will not change anything here. That is where you say inplace equals True. This means: change the original DataFrame. So this label will become zero and this one will become one. But the problem is that now it adds one more column: you had four and five here, and they will come as one more column. That is the old index; this is the new index. To remove that, you can also say drop equals True, meaning drop the old index instead of keeping it as a column. This will be your final df1. It's a bit complicated to understand, but that is what is happening: you are resetting the index and removing the old index. We'll see these things as you work. Sometimes the requirement will be totally different: I don't want to touch the original DataFrame. Then it is fine; I save the result as another DataFrame and work on that, because after filtering this is my new data and I will not affect the other one. There can be many cases. Here I wanted to change the frame itself, but in many cases the use case will be that I have a DataFrame which is used for multiple purposes.
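The whiteboard example above (names A to F, ages 1 to 6, filter on age greater than four) can be sketched as follows. The .copy() is an addition not mentioned in the session; it is there so that the in-place change is made on an independent frame.

```python
import pandas as pd

df = pd.DataFrame({"name": list("ABCDEF"), "age": [1, 2, 3, 4, 5, 6]})

# The filter keeps the original labels 4 and 5.
df1 = df[df["age"] > 4].copy()

# Default behaviour: a NEW frame is returned and the old labels become
# a column called "index"; df1 itself is left as it was.
kept = df1.reset_index()
labels_after_plain_reset = df1.index.tolist()

# inplace=True changes df1 itself; drop=True discards the old labels.
df1.reset_index(drop=True, inplace=True)

print(kept)
print(df1)
```

Writing `df1 = df1.reset_index(drop=True)` gives the same end result without inplace, which is the form to prefer when the original frame should stay untouched under another name.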
In that situation I want to apply a filter or reset the index for my own project or whatever I'm doing, but maybe somebody else is also accessing the data, and they don't want the index shifted. So I will keep the original as it is. I would always say: save the result as a separate DataFrame, and then do your activities on that. It depends on the use case you're working on. In the remaining time, I'll quickly show you some more things. Let me clear all the outputs. So there is sorting. I don't want to spend a lot of time on sorting, but basically you can do a sort_values by a column; you can say ascending equals False, and then head. It's very simple: you're saying that the column is MILES*, ascending is False, and then head. You can also sort by multiple columns. What am I doing here? I'm sorting by START* and MILES*. Can you take a guess how this will work? If you sort by multiple columns, first it picks START* and sorts it with True, which is ascending order, A, B, C; then within each start it sorts the miles in descending order. So I'm basically looking at all the trips in alphabetical order of start, with the longest trip first, because miles is descending. Why we sort, we will see later; as of now you just need to understand how to sort. If you look here, I do a head of twenty. See, it is starting with Agnew, and within Agnew it is again going 4.3, 2.4, 2.2. It is like this for every start location. That is one thing. We will look at one more thing and then finish. In the time that we have, I will look at one more thing, because this is very important: conditionally adding columns. You want to add a column. Adding a column makes sense if you're doing some analytics. For example, I want to apply a condition on the trips.
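Going back to the sorting step described above, here is a minimal sketch on invented data with assumed column names: one sort by a single column in descending order, and one by two columns with a different direction for each.

```python
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({
    "START*": ["Cary", "Agnew", "Cary", "Agnew", "Morrisville"],
    "MILES*": [8.4, 2.2, 12.9, 4.3, 3.1],
})

# Longest trips first, then look at the top rows.
by_miles = df.sort_values(by="MILES*", ascending=False).head()

# Start location A to Z; within each start, longest trip first.
by_start_then_miles = df.sort_values(
    by=["START*", "MILES*"], ascending=[True, False]
)

print(by_miles)
print(by_start_then_miles)
```

The ascending list lines up with the by list position by position, so each column gets its own direction.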
I want to categorize all the trips: less than five miles is a short trip, five to ten is a medium trip, and anything more than ten is long. That I have to add as a column. For adding a column, you can use any technique that you want; you can use normal Python code with a for loop and a condition. But here, what I have done is import the library called NumPy. We'll talk about it a little more later. You say import numpy as np, and this is how you add a column. You are saying that I want to add a column called MILES_CAT. There is a very interesting function called np.where; this is like an if-else condition. What does it do? It says that if MILES* is greater than five, mark it as a long trip, else a short trip. Now you may ask: what if I have three conditions? I will show you how to do that, but right now we have two conditions. So I'm saying: look at the miles column; if it is greater than five, it's a long trip; if it is not, it is a short trip. That's all I'm giving. If I run this, I will show you what is going to happen. I'll run it and do a head of thirty. Can you see there is a new column now, saying long trip? That is the miles category column. For each row it applies this condition and says long trip or short trip. So if you have an if-else kind of thing, np.where is very useful. What if I have three conditions, and I want to categorize into short, medium and long? You can put one more np.where in there; I'll show you how to do that. You can nest it, or "and" it, or "or" it, and say this condition, this condition and this condition. So multiple conditions are possible. Now, I don't remember exactly, but I think we can do this also, although it is not really a good thing to do. Let's say I want to add a column; let's call it year. NumPy has a lot of built-in methods, but right now what I did is just assign a single value.
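The two-way categorisation with np.where described above can be sketched as follows. The data is invented, and MILES* and MILES_CAT are assumed spellings of the column names used in the session.

```python
import numpy as np
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({"MILES*": [2.0, 5.0, 7.5, 22.3]})

# np.where(condition, value_if_true, value_if_false), applied row by row.
df["MILES_CAT"] = np.where(df["MILES*"] > 5, "Long Trip", "Short Trip")

print(df)
```

Note that a value of exactly 5 falls into the else branch here, because the condition is strictly greater than.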
So what will happen? Everywhere it will get this 2000. But if you're adding the values one by one, then you need to pass one for every row. This frame has how many rows? About a thousand, so I would need to pass a thousand values here if each row is to get its own value. Right now, if you give only one value, it just gets added everywhere. This is not very useful if you're writing a year, but if it is some common value you're writing, you can just give one value and it will populate that whole column. So now we have an extra column. You can also drop a column; we will discuss that later. What if I give two values? Let's see: 2000, comma, 2010. See, that's what I'm saying: if you're passing only one value, it just replicates it; for anything more than one, you need to give the same number of values as there are rows. Repeating a shorter list over the rows would be more complicated, but we'll discuss that with NumPy arrays. NumPy's actual use case is not this; I'm just showing an example. NumPy arrays are used for matrix manipulation, and we will see that tomorrow. This is one condition where I'm using np.where. Or I can also write a normal function: I can write a normal Python function, then pass it, and get the same result. Which where is this? This where is inside NumPy. pandas does have a where of its own, but it behaves differently; the if-else style one I'm using here is NumPy's. Another way of adding a column is that you can have a filter and say that the output should be passed as a column; that is also possible. I just thought of showing you this np.where method because it is very common, since most of the filtering you want to do is an if-else kind of condition: if some value is less or more, we want this or that. And there is also one more method, which I will probably show tomorrow.
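A minimal sketch of the constant-column behaviour described above, on invented data. The YEAR and YEAR_LIST column names are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({"MILES*": [2.0, 7.5, 22.3]})

# A single value is repeated on every row.
df["YEAR"] = 2000

# A list must have exactly one value per row; two values for three rows fails.
length_error = False
try:
    df["YEAR_LIST"] = [2000, 2010]
except ValueError:
    length_error = True

# With one value per row the assignment works.
df["YEAR_LIST"] = [2000, 2010, 2000]

print(df)
print("short list rejected:", length_error)
```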
It's called apply. apply is a function on a DataFrame: it can take whatever logic you have written and apply it to a set of columns to manipulate them. I will show you a couple of examples of apply anyway. But I think we are good at basically reading a file into a DataFrame, and at least basic selection, indexing and filtering. What did we cover? We covered sorting, selection, filtering, the date format, and the basics of how indexing works: how to reset it, and how to save the result as a DataFrame. When you actually work, more methods will come. Let me also see what we still need to cover, because we have to get to NumPy arrays. Since you have asked: what if we want to do this in-class assignment? Create a new column with the following conditions. I'll give you a clue, but think if you can figure out how to do this. The clue is very simple: you will use two np.where conditions. In the first np.where, you say that if the distance is above something, you mark it as a long trip; else, what will happen? The next np.where condition will come. So the first np.where says that if the distance is more than ten, it is a long trip; otherwise the next np.where decides between medium and short. That's it: the first np.where says long trip if more than ten, else the second np.where says medium or short. Yes, it is nesting. Even I want to write it, but I will give you some time; let's see if you are able to do something. So are you still afraid of Python? You were very much afraid of Python, right? I think it's not that difficult to understand.
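The clue above (one np.where nested inside another) can be sketched like this. The thresholds of 10 and 5 miles follow the categories described in the session, but how the exact boundary values are treated is an assumption, as are the column names.

```python
import numpy as np
import pandas as pd

# Toy data; names and values are invented for illustration.
df = pd.DataFrame({"MILES*": [2.0, 5.0, 7.5, 10.0, 22.3]})

# Outer where: more than 10 is long. Everything else goes to the inner
# where, which splits the remainder into medium (more than 5) and short.
df["MILES_CAT"] = np.where(
    df["MILES*"] > 10,
    "Long Trip",
    np.where(df["MILES*"] > 5, "Medium Trip", "Short Trip"),
)

print(df)
```

For more than three categories the nesting gets hard to read; at that point a plain Python function used with apply, which the speaker mentions, is one alternative.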
I also wanted to tell you about a book; can you search for it? It is a very good book, in fact an excellent book. It's called Python for Data Science by Wes McKinney; I'm not sure about the exact name. Sorry, it is Python for Data Analysis, not data science: Python for Data Analysis by Wes McKinney. That is a very good book if you really want to learn data analysis and pandas. And if you have anything against pandas, you talk to this guy: he's the person who created pandas and its DataFrame. Coming back to the assignment: in this case the conditions were written so that if the value is more than five we consider it medium. You could also start by writing less than five as a short trip. But as it is written, what will happen first? It will check if the value is more than 10; that's a long trip. For anything less than 10, the next condition will come: if it is more than five it is medium, and it will not go to short. I just wrote it; you guys also check it out and compare, and don't just believe whatever I'm saying. So Wes McKinney is the person who created pandas, and this book is very good if you are looking for some reference to read. You can probably get an e-copy or something. Somebody was asking how to learn Python. I gave a couple of pointers. The first thing is that you have about two months' time, because tomorrow your statistics will start, next month is also statistics, and then your ML will start. This is February, right? March, April; only then will ML start, if I'm not wrong. I don't know whether ML will start along with statistics next month. Still, you have one
month's time, right? I think that should be more than enough. I will also be sharing some practice assignments. There is a notebook on all the basics of Python, where it talks about what a list is, what a dictionary is, and so on, in more detail. I will share that. Then some assignments and solutions for pandas and NumPy. So tomorrow, what is the plan? I will spend one hour on data frames. Some more concepts are to be finished, so we'll roughly spend an hour on data frames. Then one hour I will spend on NumPy. Now, with NumPy the problem is I don't have data like this. See, data frames are easy because you'll get something like the Uber data, but when you're teaching NumPy you don't have any data. So thousands of rows or something I have to create; real data I don't have. That part you may find a bit difficult, or artificial, because I create the data and the real applications are different. You need to spend some time on NumPy, so probably an hour we'll spend on NumPy. Then we will spend some time on visualization: Seaborn and Matplotlib visualizations. There are a lot of visualizations, but even if you understand the basic four or five types, that's enough for you. And then, if I get time after that, I'll show you some data extraction methods if possible, like collecting the data from an API. So, so far in the activities we have seen column selection, then filtering, adding a column, and then sorting. This is all we have seen. And now two more things which are important in the case of data frames. One is an operation called grouping, grouping the data. This is something that everybody wants to do: group and then apply something. And after grouping, there is something called the apply method in data frames. I'll talk about what this is. So this grouping and this apply, these two things are actually what is pending. All right, and what is the idea of grouping? Normally, when you say grouping, in the statistical
terms this is called SAC. It stands for split, apply and combine: split-apply-combine. So even though it's a group operation, technically we call it a split-apply-combine operation. Why are we calling it split-apply-combine? So you remember the Uber data, right? We have the Uber data. One common situation is that I want to group all the rides based on the start city. So many rides are there, right? My intention is to group them based on the starting city. But you don't just want to group, right? So my splitting column will be the starting city, the start location. And then, for each group, each starting city, I want to calculate the average distance traveled. Let's say there is New York, and there are 20 trips from New York. I want to know the average length of a trip. So then your applying column will be the miles column, and the apply function is going to be mean. Basically you're saying: group by the start column, then take the miles column for each start location, say New York. The miles column will have, let's say, 10 trips, 10 different distances, for New York. Then apply this mean function on them. So there are three activities you actually do when you say group the data. First you take a column where you want to group it, and the data is split on that. That is why it is called split-apply-combine. Then you take the next column, where you want to apply some function. In this case I'm interested in the average distance traveled, so I say mean. Or I can ask what the maximum distance traveled is and say max, or any function that I want. I can also write my own function and apply it here; that is also possible. Finally the results for all the groups are combined into one output. This is the general logic that you follow for grouping, and we can actually see this now. If you're running this notebook today, it's the same notebook we were using; we started it yesterday. What do you need to
do? Since it's a fresh day, you need to import the libraries again; otherwise it won't work. So you can just scroll up. You need to first import pandas as pd and read this Uber data. Go ahead. Once you import, do a df.head() and you can see the data. Then you can scroll down, because these things we have covered. If I scroll down again: sorting, columns, group by. So what I'm going to do is insert a cell above. Okay. And how do you do this? Same thing: df dot. I'm going to group the data, so the function is called groupby. I want to group by this start location, so it'll be start. Then, what is the column where I want to calculate something? That's the miles column, so it's miles. Then there's an aggregation function, so I will say agg, that's my aggregation function, and then mean. So this is the general way of writing a grouping. First you say groupby, and then you pass, say, one column. You can also pass more than one column. Right now I'm saying that I want to group by start; it will simply pick up all the start values. See, Agnew will be there, then there'll be something else, okay, Austin. So that is the start column. For each start I want to calculate the average of miles, so you say miles dot agg, and then you say mean. This is called the aggregation function. In this agg you can say what function you want to call, what aggregation logic you want. By default, I think, depending on the pandas version, it supports mean, median, min, max, sum and other mathematical operations. If you want more, you can also use NumPy along with this. But as of now, just for a simple calculation, this is how you find it out. Now, what is also interesting about this method, and it depends on the pandas version whether you can do this, is that you can also do this. Look here. This is a bit confusing. Normally, when you're doing an agg aggregation, you're saying that I want to calculate a mean, right? But in simple terms, probably I can also do this.
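The grouping call described above can be sketched in pandas as follows. This is a minimal illustration on made-up data, not the Uber file used in the session: the column names START and MILES and all the values are assumptions chosen to mirror the discussion.

```python
import pandas as pd

# Hypothetical stand-in for the ride data; column names and values
# are assumptions, not the real headers from the notebook.
df = pd.DataFrame({
    "START": ["Agnew", "Agnew", "Austin", "Austin", "Austin"],
    "MILES": [2.0, 4.0, 1.0, 2.0, 6.0],
})

# Split on START, take MILES within each group, apply the mean,
# and combine the per-group results into one Series.
avg_miles = df.groupby("START")["MILES"].agg("mean")

# Calling .mean() directly on the selected column gives the same result.
shortcut = df.groupby("START")["MILES"].mean()

print(avg_miles)
```

Selecting the column before calling `agg` keeps the result to the one quantity of interest; for a single simple statistic, `agg("mean")` and `.mean()` are interchangeable.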
I'm guessing... yes. So since mean is a very common operation, average is a very common operation, you don't have to explicitly say aggregate and then mean. If you just directly say mean, it'll do the mean. But the common format is that you say agg and then you say mean within that. That's the common format. Sometimes, if you simply say mean also, it will work. So what did it do? This is very simple. It takes one location, say Agnew. If this has 10 trips, mean is the average, right? What is the mean? It'll add them and divide by the total number. It will pick up the location first. You are saying group by start location, so first Agnew; let's say Agnew has 10 trips. Then it'll pick up this column, miles. All those 10 trips will have distances in miles, right? So 10 different distances; it'll add all those 10 and divide by 10. That's the mean. Okay, now I have a question for you to guess. Don't try it; take a guess and give me the answer. What will happen if I do this? If I remove this miles column, I'm just saying group by start and then just mean. Do you think it'll work? Sure? No? It works, actually. So these are some of the shortcuts that are there. What is happening here is that you are saying, I want to group by start location, so it picks up start. Then you're saying that I just want a mean. Mean of what? Well, there's only one numeric column, so it calculates the mean of that. But if you have multiple numeric columns, it will calculate for all of them on some occasions, because it doesn't know which one you mean. So for quick analysis, what we do is this: if you're having data where there is only one numeric column and everything else is a string, it understands that you cannot have a mean on strings. You say mean and it assumes you're probably thinking about the miles column: okay, I'll give you the mean of miles. This again depends on the pandas version; in newer versions you may have to pass numeric_only=True for this to work. So if you see this, you have
to understand that the real query is not like this. The real query will be miles dot agg mean. This is sort of a shorthand notation, you can say. So sometimes, for a quick look, you would write it like this: just take the df and show me the mean. It doesn't really care what the mean is of. So you can just group by start and mean very easily, and you will get the same output. Now suppose you are interested in something like this. For example, let's take start and stop. We have start and stop, right? There is a start column and there's a stop column. Now I want to group by both columns. That is also possible. I want to group by start and group by stop. Probably here I want to calculate something like a mean, and here I don't want a mean, I want to calculate the min or max or something. So the common situation is that you have the rides data and I want to group the data based on the start location. For each start location I'm interested in the mean distance traveled. And then I also want to group it based on the stop location, where he was going, and I want to know, based on the stop location, what the longest trip was. If you say max, that is typically the longest, and min can be the shortest distance. If you want to combine them, that is also possible. How do you write it? It's actually very simple, because again you will have your groupby. But now, instead of passing one column, what do you need? You need two columns, so you pass two columns within a list. So you will say start, then stop. Or let's do something interesting. Probably you want to group by start and stop, and then you want to calculate, let's say, the mean and the sum. Let's say that is the idea. So what I will do: everything is the same, then dot agg, so agg is my aggregation. The grouping columns
are start and stop, and agg is my aggregation. What do you want to do? You want to calculate the mean and the sum, so again two values are there. How do you pass two values? You can pass them in a list, and you can put them as strings: you can say mean and then you can say sum. And probably I can do a dot head, just to see. Okay, this is something very interesting. I will talk a bit more later about how you read this data. Oh, by the way, there was an assignment; let me just check. So can you make some sense of this output? What are we getting? I showed you the query I wrote, and I showed you the output also. What do you infer from this? What does this mean? If your customer is asking what this is, what will you explain? The thing is, first it will pick up start. So let's say Agnew. Then what will happen? For Agnew you have stop locations, right? From Agnew you again have the stop locations to which they have traveled. Then it is calculating the mean and then the sum for each. But this looks weird, because of the columns here. Look at the columns. It is like this because this is a nested column. I'll come back to this in a moment and show you. If you do a dot columns, you can see that these column headers are nested. So if you're presenting this in a nice format or something and somebody looks at it, they're like, okay, what do you mean by this? It's miles, then mean, then stop; what do you mean? So you need to reshape the columns. I'll show you how to do that. But basically, what are you doing? You're grouping by start and stop, and for each you're calculating the mean and then the sum. Have you mentioned the column? Oh, I have not. So it is automatically taking miles. Since we have only one numeric column, it'll pick it. I can also add miles manually; it will work. That is why. Previously we looked at it, right? So if you look
here, this command... what did I write? Oh, I've deleted it. Yeah. So if you're having only one numeric column, by default, if you calculate a mean or something, it will pick up that column. But if you're having more than one, you have to say miles. So yes, that's correct: ideally, when you write the query, you have to mention that your column is miles. You need to say which column you want. If you have more than one column, it will not pick one automatically. It depends on the pandas version. In one of the pandas versions, what it does is that it will do that for all of them. So if you have, like, four int columns and two float columns, it'll calculate for all six columns. That is also not something you want; you want only a specific column. So you have to say groupby, then whatever grouping columns you want, then in square brackets you say which column you want. But anyway, if you have understood this, I have an in-class assignment for you. Look here. Okay, so this is the same thing again, right? You're grouping by start, then miles, then mean. This we have already done; it is the same thing. You can also do something like: for each start location, find the mean and the total distance traveled, for example like this. Here we have only the start location. In my example I grouped by start and stop; in this example here you have only the start location. I'll show you. Right now the labeling is weird, because if you look here, I actually have only four columns. The first two columns are fine, but again here you have sort of index columns. We can do one thing: assign this to a variable. If I'm assigning this to a variable and then I say dot columns on it, what does it say? MultiIndex, levels and then labels. So basically what is happening is that it is nesting. It is actually a nested header. I will show you how to remove it and add proper column headers. But right now, can you try this
in-class assignment: find the most recent and the earliest travel date and the mean distance traveled for each start city. So, like in every class, the assignment is more complicated than my explanation, usually. But before you try this, let's try to understand the problem. What do you want to find? The most recent and the earliest travel date, and the mean distance traveled, for each start city. Let us break this down. What will be the grouping column first? You have to say groupby start. So this is the first level. Then, once you group it, you want to apply some logic, whatever it is. On what? What are the aggregation columns that you're talking about? No, no, not dot agg; that is where you apply a function. Not that. On which columns do you want to apply the function? Find the most recent and earliest travel date, so you have two columns: one will be date, one will be miles. So the grouping will be start, and then the aggregation columns, two will be there: one will be date, one will be miles. And then the function. On miles, what do you want to apply? Mean distance. So here your aggregation function will be mean, correct? On date, what will it be? We want to find the earliest and the most recent, so we can say min and max. If you have a date, you can calculate the min and max. The min date will be the first ride, and the max date will be the last, the most recent, ride. So here, in the aggregate, you will have min and max. This is how you break down the problem. First you find out what you want to group by; that is for sure start, because for each city I want to find something. Find on what column? On the date column and the miles column. On the miles column I want to apply mean; on the date column, min and max. But don't try this yet. There is a trick. This may not work if you write it straight away. I mean, the logic is correct, but you need to do one small thing. But I'm not
talking about this method. There is one small problem: if you want to work on a date, right now this date is a string. You have to use pd to convert it to the date format. I will probably help you with that, because there is one more problem. Can you look at the last row of the Uber data? Earliest and most recent: earliest is the first trip, most recent is the last trip, right? Yeah, that is what I meant. The earliest trip is the first, that means min, and the most recent is the last, meaning max. Maybe my English is bad. I want to find out the first trip and the last trip. That is the logic, basically. So don't do this yet. The problem is you need to first convert the date column to the date format, because it'll be a string. How do you know? Do one thing: check the data types of this thing. It's an object, for sure. Start date is an object right now. And if you do a df.tail(), what do you understand from this? See the last row. You don't want the last row; it's something we don't want. So from the data frame, first you have to remove the last row. Okay, so tell me how you do it. df equal to df dot... why don't you use iloc? I want all the rows except one. So the first task is removing the last row. You will say df dot iloc, colon minus one; that means all the rows except the last one. Then again a colon, because I want all the columns; I'm not interested in removing any column. And I assign that back to df, because otherwise it will not take effect. Now if you do a tail, you can see the last row is removed. Right. Now what you do is convert the date: df of start date, the column name, equal to pd dot to_datetime of df of start date. Now you can think about the logic. So now just see if you can build it. I will also help you; I'm just giving you some time, so probably if
you can, try it out. An error while writing the statement? This one should work, right? Which one, the iloc one? Are you sure that's exactly how you typed it? Because yesterday you did it, right? Yesterday we converted this column: df of this column, in square brackets, equal to pd dot to_datetime, then a bracket, df, again a square bracket. Is there anybody who is not able to do this datetime thing? We were doing that yesterday, right? Whatever you're getting, not the assignment; I'm just talking about this conversion. Forget the assignment. So one of the problems might be that you are using a very recent version of pandas, maybe; something like this may no longer be supported there. This iloc, if you don't want it, you can use loc also. What is the problem now? The groupby column is fine: we're grouping by what? Start, right. Then the aggregation columns are a bit tricky, because you are saying that I want to apply min and max on date and then mean on miles. So then how do you differentiate? That is where you need to use a dictionary. It's slightly tricky, because previously I was saying only one. If I have only one condition, I can simply say agg, call average, and it will do it. But I have multiple conditions, so I need to say: for date, this key, I want to do these two operations, and for miles I want only mean. Try to write it. Let me see if I can write it, then I will show you. So I'm going to assign this to something like res. How do you start? Can you help me? df dot groupby... start date? No, not start date. Who said start date? You're confusing it. What is the column name? I'm just asking. It should not be start date; it has to be start. Start date is the date. I'm grouping by what? The location, right? You're grouping by where the trip is starting, not by the date it started. So it should be start. Okay, I just looked here.
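The preparation steps and the dictionary idea discussed above can be sketched like this. It is a rough illustration on invented data: the column names START_DATE, START and MILES, the trailing "Totals" row, and the variable name res are assumptions standing in for the real notebook.

```python
import pandas as pd

# Invented stand-in for the ride data; names and values are assumptions.
df = pd.DataFrame({
    "START_DATE": ["2016-01-01 21:11", "2016-01-02 01:25",
                   "2016-01-05 17:31", "2016-01-06 14:42", "Totals"],
    "START": ["Agnew", "Agnew", "Austin", "Austin", None],
    "MILES": [2.0, 4.0, 1.0, 5.0, 12.0],
})

# Drop the trailing summary row: all rows except the last, all columns.
df = df.iloc[:-1, :].copy()

# The dates arrive as strings (dtype object); convert them to datetimes.
df["START_DATE"] = pd.to_datetime(df["START_DATE"])

# Different functions per column: map each column to its list of functions.
res = df.groupby("START").agg({
    "MILES": ["mean"],
    "START_DATE": ["min", "max"],
})
print(res)
```

The result carries nested column headers (a MultiIndex), with one entry per column and function pair, and the grouping column START becomes the index.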
Start, stop. And now what is the technique that you want to use? You want to pick up a column and then apply the function, right? So I should be doing an aggregation. Okay. Normally, when you write it, you will say a list of columns and then apply. But right now I cannot do that, because I am having one column with multiple conditions and another column with one condition. So I need to represent the whole thing inside a dictionary. I'm first saying that I want to do an agg, okay, and I will open a dictionary. Let me see if it works; actually I have not tried. What are the columns we need for aggregation? You will open a dictionary, and then you will say miles. So within this I will have miles. And what do you want to do on miles? You can say mean. I can either pass it like this or I can pass it as a list. Let me write it, then I will come back to you. And what is the second thing? Start date. And here what will be my values? What do you want to do? Min, comma, max, right. Normally you will write it like this: you will say groupby some column; then you will say, I want to pick only one column, say column two; then you use dot agg, and then you say mean. This is normally how you write. But here my problem is, first of all, I have multiple columns. So if I'm passing column two and column three, how do I tell it that for column two I need two things to be calculated? I cannot write it like this. So that is where the dictionary comes into the picture. When you're writing a dictionary, the syntax will change. You'll again say groupby; this will remain the same, you will say, I want to group by this column. But you will not directly call the column. You will say, I want to apply an aggregation function, so this is my aggregation function, and then you say which function on which column inside this. So inside of this
thing you will define a dictionary. This is a key-value pair. So here is column one, and the things I want to apply; then column two, and what I want to apply. Can you try this and see if you get the output? Yeah. But this is again a bit more lengthy code. If I have only one column, the earlier way is easier to write; this is just another way of writing. If I have multiple conditions, then this makes sense. It will work even with a single element also. You see, I'm passing a list here; I can pass a single function here. Here I am saying mean as a list. I don't have to use a list. Why am I using it? Because I just want to maintain the same style that I'm using here. And one thing which is very confusing, and this is not something to learn by heart: in certain pandas versions, if you write something like this, it'll start throwing an error, saying that the list and dictionary syntax is wrong; in others there will be no error. There is no generalized way to say that this is the only way to write it. The common syntax is like this. But depending on the pandas version, sometimes it'll say that the list or dictionary that you're passing here has to be in the proper format. So what we normally do is this: if you're having a single column, you always follow this method: you say groupby, then the column, and then the aggregation. But here you're having multiple columns and multiple conditions on the columns. The confusion is that for the start date I need two aggregation functions. That is why I cannot say start date dot min and max, and that is why I'm defining it as a dictionary. Inside the agg it is represented as a dictionary, so it has to be a key and a value. I cannot say miles dot mean here, because this is a dictionary, and a dictionary has to have a key and a value. Why a dictionary? Because the start date has two values. If start date had only one value, I
can say dot agg and it should work for both. Then you won't even need a dictionary; you can directly use the previous method, what we were doing. That way you can do these things. I'll give you some more assignments on this so that you can understand more about grouping. This is one example, but there will be some simpler assignments on different columns. And tomorrow you'll have a practice session; there you will have around three or four hours to spend only on these things. It's not like the class is over after we finish today. You will still have some more time for a workshop and some more assignments and all this. I just want to introduce that these kinds of things are possible. Are you sure there is the proper number of brackets, and the syntax? Okay, because that might be one reason. So I have a question for you now, in the same example. If I wanted only the average miles and the most recent trip, will you write it like this? Well, you can, but you don't have to. Are you getting the point? You can write it like this, but you don't have to, because if I'm having only the mean and max, I can simply call the columns and pass them, exactly like we were doing previously. This dictionary notation came in since we had multiple things we were passing. Okay, now somebody was asking a very good question. This is fine, but if you look at the output, it's weird, right? Because there are all these nested columns here. Normally, when you do a grouping, nested columns will appear in your output. Right now my data frame is called res, and the res columns are very weird. So what do you need to do once you have done this? Here also, if you look at the start column, it is unique, right? There is no duplicate. So you're
grouping by what? That column, start, so it'll pick it up. In Agnew there are, like, 11 trips, so Agnew is repeating 11 times, but you see it only once; it got grouped. But what you should also do is a reset_index. Can you tell me why I am doing this? Any idea? Look at the columns now; I'm going to run it again. When I did a reset_index, what did it do? Previously the index was this Agnew and so on. That is how it came, right? When I did a reset_index, it added 0, 1, 2, 3 as the index. But again I'm not happy, because of these column names I have. So I can simply say res dot columns equal to... what are the columns you want? First will be start city. What is the second column? Average distance, right, that is the mean. The third column is earliest trip, the first trip, not the recent trip. Then what? Recent trip. Happy? This is what you should do, because once you do a grouping, these columns will get nested. So you need to remove the nesting. The start column has become the index, so reset_index gives you a new index, and then you just say what column names you want. So now you have a proper data frame. In this form you can have your own column names, and you can work with it later. By default the index was the start city, but I don't want that to be the index. After groupby, the output that you get is based on which columns you selected for grouping, so by default those will become the index. So if I'm removing these lines, let's run this again. See this? Why is my index now start? This index is start. If you want that to be the index, that's fine. Maybe my requirement is that my index column has to be the start column. That's okay for
me, I don't mind. But a better option will be to remove it and add your normal index, 0, 1, 2. And then you are removing the nested column names and giving your own names, so just say whatever column names you need. Yeah. There are other operations where the index actually comes into the picture. For example, you can take an existing column and set it as the index. In those cases, if you want to select based on the index, you can do that. It's not mandatory. The index is normally the standard row number, 0, 1, 2. But let's say there is a country column. I can say the country column has to be my index column. So whatever values are there in the country column, those will become my index. Then I can operate horizontally also. Usually all the operations are vertical in a data frame, but there are horizontal operations also. Like when you're talking about loc, right? What if I want to use loc in the horizontal way? Let's say I got the data from 150 countries and I loaded it into a data frame. That will have a row index of 0, 1, 2, 3. Now I want to do some calculation where it is horizontal. Then I have to say select 100, select 200; I don't know which is which. But if my country column is the index, I can just say Japan and then do something. So for selecting it might be important. Otherwise you can leave it as it is. I will show you an example, probably. We have a sales data example where indexing might be very useful. Now, about the list: a list is just simple elements; a dictionary is when you have a key and a set of values. If I'm saying that I have three columns and I want to apply min or max on each column, so you have three columns and on each column you are applying one aggregation function, then you can use a list. So I'm saying that I have revenue, a revenue column, okay, then I have another column
called sales, then I have another column called profit. Three columns are there. Now on each column I want to apply some sort of mathematical function. What I can do is pass all of them as a single list: I can say revenue, sales and then profit. This is a list because this is a set of columns. And for my aggregation, again I can pass, let's say, mean. If I simply pass only mean, what will happen? On all three columns the mean will be calculated. If I want more than one thing, I can pass a list of functions, let's say mean, max, min, like this. So this is two lists, actually: one of columns and one of functions, and every function in the list is calculated on every column in the list. But now my requirement is that on the revenue column alone I want to calculate, let's say, min, max and mean, three things on the same column, and something different on the others. Then I need to use a dictionary. I will say revenue is my key, and the values I can pass as min, max and mean, as a list. You understand, right? One is a column, one is a function: this is the column that you're operating on, and this is the function you're applying. So the question is: are you applying the same functions on all the columns? Then you can pass the columns as a list and the functions also as a list, and it will apply. Otherwise, you need to use a dictionary. Exactly. With lists there is no mapping: I pass a list of columns and a list of functions, and all the functions are applied to all the columns. On the other hand, if I have one, two, three columns and for each column I'm doing three or four functions separately, I need to map them: which function should run on which column. So there I need to use key-value pairs. The idea of a key-value pair is that for one key you
have three values, so all three things will be applied for this key in this example. All right. So if you're passing lists, every function will be applied on every column. But otherwise, if I have different ones for each column, I'll use a dictionary: this is the key, these are the values. Yeah, you can drop it, right; we didn't do that. But if I drop it here, then it may not be good, because my intention is to keep the index starting from zero. Okay, give me one moment. Let me just comment this out. So here you have the grouping. From here, you want to drop the index? The res is here. Yes, the index is start. Okay, so you're asking: is it only after this reset that you're able to add the columns, or can you do it even without that? So now you have res, and you're asking whether you will be able to do this directly. Let's try it. No, that cannot be done. See what it says: length mismatch, expected axis has three elements, new values have four elements. Why is it saying the new values have four elements? Okay, let me show you this: res dot columns. So when you look at this MultiIndex: one, two, three. You are actually having three columns in this after grouping. Even though it looks like more, it is all nested. If you look at the columns here, you have something called levels. Miles mean is one column, and start date min and start date max are the other two; the start city is not a column here, it is the index. So you are having three columns, and you're saying that I have four column names for my data frame. That is why it is saying that the length doesn't match. So you need to reset the index. You have to first say reset_index, so that it properly adds an index, okay, and then the columns, and then
only then can you rename them. So did you guys understand this? This is the grouping result, right. After the grouping, if you look at the columns, how many columns do you have? Three columns. Even though it is displayed in this fashion, if you actually look at the columns it says MultiIndex. So this is not your normal column; it has nested indexing. It is saying that on one level you have miles and start date. So what was the grouping column? What did we group by? Start date, right, so start date is there. Then you have miles; miles is what we are calculating the average on, the distance travelled. Then it is saying that you are passing max, mean, min; these are the operations you are doing on the data. And then it is also assigning some labels for each column. So when you look at the columns of your grouping output you will see that there are three columns, and I cannot directly change them. That is why I am saying that I want to reset the whole index. When you run reset_index, what it eventually does is add an index column starting with 0, 1, 2, 3 and then level your columns, and then you will be able to add my custom column names, whatever I like. So this is after leveling, right. Yes, reset_index. Now if you run res, you have this index and then these columns: one, two, three, four columns. So you want to drop this? You can say drop the index. But then what will be there? Let's say you are adding these column names; then anyway that is gone, right. Okay, from here you want to remove this index? I don't think that is possible, because a data frame must have an index. If you remove this, then what will happen? Your start date will become the index column. Now, there is no way you can do that.
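The grouping, the length-mismatch error, and the reset_index fix described above can be sketched as follows. This is a minimal illustration, not the instructor's notebook: the data and the column names START_DATE and MILES are assumptions, and the aggregation is done on a single column so that the result has exactly three statistic columns.

```python
import pandas as pd

# Hypothetical trip data; column names and values are assumptions for illustration.
trips = pd.DataFrame({
    "START_DATE": ["2016-01-01", "2016-01-01", "2016-01-02", "2016-01-02"],
    "MILES": [5.0, 7.0, 3.0, 9.0],
})

# Group by start date and compute three statistics on the miles column.
res = trips.groupby("START_DATE")["MILES"].agg(["max", "mean", "min"])

# START_DATE is the row index here, so only three columns exist.
print(res.columns.tolist())

# Assigning four names to three columns raises a length-mismatch ValueError.
try:
    res.columns = ["start_date", "max_miles", "mean_miles", "min_miles"]
except ValueError as err:
    print("Error:", err)

# reset_index moves START_DATE into a regular column and adds a 0..n-1 row index,
# so four custom names now fit.
flat = res.reset_index()
flat.columns = ["start_date", "max_miles", "mean_miles", "min_miles"]
print(flat)
```

After reset_index the frame still has a row index (0, 1, ...); it has only been replaced by the default one, which matches the point that a data frame always keeps some index.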
As I told you in the first class, a data frame is not like your Excel sheet. In an Excel sheet you have only columns; you don't have a row index. If I remove it or drop it, how do you use iloc or loc? How do you access a row? Not possible, right. If I somehow drop the index, then how do I say I want only the third row or the fourth row? Not possible. No, she is saying "I don't want an index at all, I want to remove the index." That is not possible, right. I can keep this as my index if I want. Okay, so she is saying that she wants to hide this, to remove this. I don't think that is possible, because you have column indexes and row indexes. The column index is the set of column names; the row index by default starts from 0, 1, 2, 3. If I want, I can take a column and say that I want this to be the row index, but whichever way, it is needed for accessing. So I don't think it is possible to completely remove the index. Otherwise how do you say that you want the fifth row? There should be some way to say that, so this number has to be here one way or another. I have another data set; I will probably show you how you can select a particular column and make it the index if you want. But one more thing I will tell you, and you can try it: if you save this as a CSV file, do you think this row index 1, 2, 3 will come? I am not quite sure about it. Can you try? How do you save it? You say res.to_csv and, in the brackets, in double quotes, the CSV file name. Can you try it right away? So in fact, if you do it directly like this it is very difficult to access the rows and columns because they are nested, right. That is why we are saying: add an index and have proper
columns. Which one? That is the index column. So it is picking the first column as the index column, and it does not usually show that, right. In this data frame you can try iloc or loc on that column. If the requirement is like that, you can keep it like that: I want that column to be my index column. But that will not be counted in my actual data; the index column is not part of your data, right, and that is why it is not showing. In this example the first column is like an index column; that is why you are not seeing it. But that is bad, because I want that back in my data. That is why I am saying reset_index, because I want my own index column, so this will be pushed aside as a regular column, and then I can just rename it and keep it. So can you save it as a CSV and tell me what you see? Not taking it? It should take it, right. Let me run this. Okay, I commented this out. I forgot; I think it was to_csv, right, res dot. It will take it. Can you see my screen? If you come here, this will be your workspace. This is your root directory, and you just download it. So this is my CSV, and if I open it, let me see what we have. Oh, I can't even open it. No, that is not a problem; the index will come when saving it. Here you can probably hide it if you want. What I am saying is that from the data frame perspective I may not want to hide it, because I want to access the rows; I want to say that I need the data from five to ten. But on the Excel end I don't want this column, because I know how to get at the data in Excel. So I can either delete it or hide it in Excel. When you are finally saving it, and I am not 100% sure, there is probably an argument for removing the index while saving. It will remove it.
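The saving behaviour discussed above can be checked with a small sketch. DataFrame.to_csv does take an index argument; index=False leaves the row index out of the file. The frame below and its column names are made up for illustration, and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical result frame; names and values are assumptions for illustration.
res = pd.DataFrame({"city": ["Pune", "Delhi"], "mean_miles": [6.0, 4.5]})

# Default behaviour: the row index is written as an unnamed first column.
with_index = io.StringIO()
res.to_csv(with_index)

# index=False keeps the row index out of the saved file.
without_index = io.StringIO()
res.to_csv(without_index, index=False)

print(with_index.getvalue())
print(without_index.getvalue())
```

The data frame in memory is unchanged either way; only the written file differs, so the rows stay accessible by position while you work.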
Right, there is an argument for when you are saving it. But when you are working in the data frame you don't want to remove the index, because then how do you access the rows? It is not possible, right. This index was added automatically when you read the data. You can try that: check how the data looks when you save it for later. So guys, in the interest of time I will move to the next topic. As of now we are done with grouping. I will give you more examples and also clear some doubts offline. I just want to finish one more topic before we go to NumPy and other topics. So, looping; we will look into that. Have a look here. This is a very simple example. We did not go in depth into control structures in Python like for loops, if statements and while loops. But something very commonly used is that if you have a list like this, I can open something called a for loop. I am saying "for i in temp". When you say for, it will pick up each number one by one. I am saying res.append, so you can add something, and I am saying i * i, and then I am printing res. So res is an empty list. Basically what is happening is that this for loop will pick up each number, multiply the number by itself, and add it to this empty list. So if I run this, and you can also run it, you will get 1, 4, 9, 16. Instead of 1, 2, 3, 4, each number is multiplied by itself. It is a simple for loop, nothing more. In Python it is very common that if you want to work through a list and do some action you write a for loop, because programmatically you can say "for every element I want to do something", and this is one way of doing it. For the same thing there is also something called a list comprehension. Look at this example. What we are doing here is called a list comprehension. It is the same technique: I can just open a list, and within the list I can say x * x for x
in temp. So basically I am saying: take every element in temp and multiply it by itself. This is the long way, this code is very lengthy, but the same code I can write like this; this is called a list comprehension. Very commonly in Python you will see this: you will see a list, and within the list there is a for loop. That means the loop is running inside the list to manipulate the elements. That is called a list comprehension. It runs for as many elements as there are in the list. So this x * x for x in temp does the same thing, and your original temp is still here. All right. And what is the difference between a for loop and a while loop? I will give you some examples, but a while loop can become infinite; a for loop can never become infinite. It is always finite. You will never end up in an infinite loop in a for loop. In a while loop there is a chance: if I say "while x is greater than zero" with some number, then it keeps on running for infinity, right. Now I want you to do one thing. There is a data set I want you to upload. Can you open the data you downloaded yesterday, the final Python files? There is a file called store sales. Can you see it? Can you rename the file: just keep it as store_sales, remove the one, and upload that into your Python environment like this. Here I have uploaded it; can you see store_sales? Once you upload it, you come back here and you should be able to read it like this. So can you read it? This is a very interesting data set, because here we have a store ID, so you have S1, S2, S3, up to some 100 stores. There is a city where each store is, and then there are months, Jan, Feb, March, et cetera, and the sales are in thousands, so there is $1,000, $20,000, et cetera. This data is particularly interesting because there is a horizontal way of working with it and there is a vertical way of working with it; you can look at it both ways.
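The for loop, the equivalent list comprehension, and the while loop described earlier in this part can be sketched as below. The names temp and res follow the lecture; the countdown example is an assumption added only to show a while loop that does terminate.

```python
temp = [1, 2, 3, 4]

# Long form: an explicit for loop that appends each square to an empty list.
res = []
for i in temp:
    res.append(i * i)
print(res)

# The same result written as a list comprehension.
squares = [x * x for x in temp]
print(squares)

# A while loop needs its own stopping condition. Without the decrement below,
# "n > 0" would stay true and the loop would never end.
n = 3
countdown = []
while n > 0:
    countdown.append(n)
    n -= 1
print(countdown)
```

The for loop over a list stops by itself when the list is exhausted, which is why it is the safer default when you just want to visit every element once.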
We have the store IDs, the cities, the months and the sales. So are you able to read it? What is not defined? read_csv? You might have to import pandas if you have not done this, right. For me it is working. Okay, rename the file to store_sales, or give it store_sales1. No, I thought you didn't do it so I just added it; it is not required. If you imported pandas the first time, then it is not required again. Are you reading the file and then going ahead? Okay. So tell me one thing: I just want to calculate the average sale for the Jan month. What will you do? Only for the Jan month, I want the average across the Jan column. You can say store_sales, and then what do you want to group by? I can't say group by, right. I am just interested in calculating the average, or the mean, for Jan. I thought some of you might ask me, but nobody asked: what is special in this code? Is there anything interesting that you find? Well, I think we already discussed it. Normally you have to say store_sales.loc of Jan. I am not using that; this is like a shortcut. Ideally it has to be .loc of Jan, but you can also just say .Jan and it will pick it up, and I am just saying mean. So what you get is the mean of all these values, only for the Jan column. But now I am not interested in that. I am interested in calculating, let's say, the mean for store ID S1. For this store I have Jan, Feb, March, all these months, and I just want to calculate the average of this one row. So I want to operate horizontally; that is possible, right. And vertically, right now we have picked up only Jan; maybe I am interested in picking up multiple columns also, so let's see how to do this. There is one interesting function that I want to introduce to you; now, this might depend on our pandas
version; it is called the apply function. I can simply say store_sales.apply. So apply is the function, and you can simply say mean. Can you take a guess what happens if I run this? Yes. Basically apply is a special function, and within apply you need to say what you need to do. Here I am saying mean. What it did was pick up each month and calculate the mean. Is there anybody for whom this is not working, for whom it is throwing an error? You might run only this. If you are getting this output it is fine; sometimes, in some pandas versions, you may get an error. Can you tell me why it might give an error? Right now you are getting it fine. If you think about it, what this technique will do is pick up each column and then apply mean. But then what is the problem? We have a column that is a string, right. So in some pandas versions, if you run this, it will throw an error. It will say "I am trying to calculate the mean for the column, but the second column is a string, so I don't know what to do." I will show you what you need to do there, because that might be very handy. There you have a command called select_dtypes. You can say include only integer and float, and don't consider strings. But if it is working fine, then it is fine. Now what if I want to do it horizontally? For example, right now you did this; I want to calculate the mean for store ID S1 for the year: Jan, Feb, March, April. So for S1 I want the mean horizontally. That is also possible. For one particular store I am interested in the yearly average. Then you can simply say comma, axis. So there is a concept of axis. There is an axis zero and there is an axis one. Axis one will be row-wise; axis zero will be column-wise, which is the default. So if I say axis equal to zero, what will happen? It goes column-wise; by default it is
this. I can say axis equal to one; that will take each row and calculate its mean. But the apply function is not meant only for these kinds of things, because calculating the mean is very common. You can also have somewhat more complicated functions in apply; we will see that. Okay. Axis one is row-wise, so for each store it is calculating the mean of this line. That is what you see here. Axis zero is column-wise. I need to check whether hiding a column is easy, and whether removing a column is also easy; I will show you. Axis zero is the default: if you simply call it, the apply function works column-wise, which is equal to axis zero, so you don't have to explicitly write axis zero. But what I am saying is that in many cases I have seen, when you start working in production, you will probably have a data frame that has a mixture of string, float, integer and many other columns. So sometimes when you simply call the apply function with something like mean, it will throw an error saying "I was trying to calculate it, but I encountered string columns which I don't know what to do with." There you need to use a small technique. I will see if it works; I am not quite sure. Here it is just calculating the mean of each column, or of each row. The store ID is a string, right; it is not considering it right now. But I am saying that sometimes you will have a string column and it will throw an error saying that it cannot convert. In case that occurs, here is what you need to do. I will come back to your question.
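The column-wise and row-wise means discussed above can be sketched like this. The small frame is a hypothetical stand-in for the store sales file (its names and numbers are assumptions), and select_dtypes(include="number") is used here as a compact way to keep the integer and float columns; listing "int64" and "float64" explicitly, as mentioned in the lecture, is another option.

```python
import pandas as pd

# Hypothetical stand-in for the store sales data; values are made up.
store_sales = pd.DataFrame({
    "store_id": ["S1", "S2", "S3"],
    "city": ["Pune", "Delhi", "Pune"],
    "Jan": [8, 12, 16],
    "Feb": [10, 14, 6],
})

# Mean of one column; attribute access is shorthand for store_sales["Jan"].
jan_mean = store_sales.Jan.mean()

# Keep only the numeric columns so the string columns cannot cause an error.
numeric = store_sales.select_dtypes(include="number")

# axis=0 (the default): the function receives each column, giving one mean per month.
monthly_mean = numeric.apply(lambda col: col.mean())

# axis=1: the function receives each row, giving one mean per store.
store_mean = numeric.apply(lambda row: row.mean(), axis=1)

print(jan_mean)
print(monthly_mean)
print(store_mean)
```

For a plain mean, numeric.mean() and numeric.mean(axis=1) give the same numbers; apply becomes more useful once the function is something you wrote yourself.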
Okay, in a moment. You have to say select_dtypes and then include; I think you can mention it like this, let's say int64 or float64. Give me one moment. Okay, yes. Just make a note of this, because this is not something that you normally see. But if you are getting an error saying that it is not able to calculate the mean, you just add this select_dtypes; you can say include int64, or whichever type. This says that I want only the integer columns; it will forcefully omit the string columns. All right, I will show you, just give me a moment. See, apply is a function which works row-wise and column-wise. In apply it is not directly possible to say that I want to apply only on selected ranges. I will show you how to do that in agg, which is an aggregation function. The way agg works is that first you group, and then on that you say that you want to apply a function. So this is for a common data frame where I want something done to the rows and columns; this is how you do it. If you wanted to use it straight on a group, it will not work. That is what I am saying: apply is something which normally takes a function, like mean, and applies it to either whole rows or whole columns; that is the idea. I will show you where that actually matters, where that condition will come up. With group by and the function that you are writing, that is where I want to group something and then probably write an aggregation function. For that aggregation the logic is that you first combine the rows. So can you try to find out the city-wise sales? What do you group by? City. I want to group the stores by city and then calculate, probably, the average or mean sale. Can you try to do that? What I am saying is that usually when you call the apply function, if you have string and integer columns, it will omit the strings. But in some pandas versions
it will not: it will try to calculate the mean of the strings and throw an error saying "I cannot find out what the mean of a string is." So there you need to use select_dtypes. But still, the end goal is that you take the values and calculate the mean or whatever you want. If my values are strings and I try to do a mean, I will not get anything, right; it will throw an error. So I didn't get your situation. Where will such a situation come up? Can you give me an example? Okay, fine, but that will be a string only, right; those won't be numeric. So now you tell me what you will do. Okay, so there is a question. He is saying there is a column, let's say a sales column. It is defined as a string column, but the values inside are numbers, for example 3.33. What is the solution? That we discussed this morning. In my version of pandas it will omit that column, because it is a string column: it is looking at those numbers, but they are within quotes, so they are strings. Check the dtypes: store_sales.dtypes. Now, the apply function works on the whole thing, all the columns or all the rows. So usually the idea is that if you want to select certain columns and do it, either you need to subset it or you group it; I will show you. This one is not the apply function; this is simply the mean. The apply function usually works on whole rows and whole columns, not selected ones. One thing I showed was how to apply the mean, but a more interesting example will be this. Rather than discussing it, I will just show an example. Insert a cell above. So have a look here, guys. Do you know how to define a function in Python?
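The city-wise average and the question about a numeric column stored as strings, both raised above, can be sketched together. The data is hypothetical, and converting with pd.to_numeric is one possible answer assumed here for illustration; the transcript does not say which conversion the instructor had in mind.

```python
import pandas as pd

# Hypothetical store data; the Feb column holds numbers stored as strings.
store_sales = pd.DataFrame({
    "store_id": ["S1", "S2", "S3", "S4"],
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "Jan": [8, 12, 16, 20],
    "Feb": ["10", "14", "6", "3.5"],
})

# dtypes shows that Feb is not a numeric column yet.
print(store_sales.dtypes)

# Assumed fix: convert the quoted numbers to a real numeric dtype first.
store_sales["Feb"] = pd.to_numeric(store_sales["Feb"])

# Group the stores by city, then take the mean of the selected month columns.
city_mean = store_sales.groupby("city")[["Jan", "Feb"]].mean()
print(city_mean)
```

Selecting the month columns after groupby is the "subset it or group it" idea: the aggregation only ever sees the columns that make sense to average.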
Normally, if you want to define a function, let's say a function like add, you say def add, and in the brackets a comma b, because I want to pass two elements. Then, if you are passing a and b, what will happen? You say there is something called res equal to a plus b, and you simply return res. And how do you call the function? You say add and then 3 comma 4. So have a look at what I did: I created a simple function called add, and I am saying that there are a and b; if you pass a and b it will return a plus b. This is how you define a very simple function in Python, just to show you the syntax. Now my condition is this. Look at the store sales data. I probably want to give a bonus. I have stores, there are store IDs, and what I want to do is pick up a column, let's say the Jan month sales. If the Jan month sale is, let's say, more than 10,000 for a store, then I want to give them a bonus. These are the situations where the apply function comes into the picture. It is not that the apply function is used for everything. But one common example is that I have the store data, I want to take the Jan column, and my condition is that if the store has made a sale of more than 10k in Jan, then I will call this store an eligible store and give them a bonus. If the value is less than 10k, I will say not eligible. Now can you tell me how you can check whether it is more than 10 or less than 10? Any technique we did yesterday? np.where, right. I have already written it, I think, so I can save some time. Look here. I have defined a function called bonus_function, and the bonus function takes something called sales numbers, or I can say sales column. I had probably written it as a comprehension, so I just rewrote it in an easier way so you guys can understand. So tell me, how do you write the bonus function? You again say def, and then np.where, right?
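The add function described above, written out as a minimal sketch with the same names used in the lecture (add, a, b, res):

```python
def add(a, b):
    """Return the sum of the two arguments."""
    res = a + b
    return res


# Call the function with two values.
print(add(3, 4))
```

The def keyword starts the definition, the names in brackets are the parameters, and return hands the result back to whoever called the function.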
All right, what is the condition? First let's do one thing: let's call it something. I will say I have something called eligible_values. If you guys want, you can write it along with me. We will see if we can refine this function. So just type def and then bonus_function, and this function is taking a column. I will say eligible_values is np.where. What is the condition? When my sales column is greater than ten, I will say eligible, or else I will say not eligible. And then what do you need to do in the function? Return eligible_values. So what am I doing? I am defining a simple function which takes something called sales column and compares it; it will say eligible or not eligible. But that is not the end of it. I will just see whether it works. On store_sales I can say Jan_bonus, and I can simply say bonus_function, and what will I pass? store_sales, and what is the column? Jan, right. Oh, np; I need to import it, I don't have it. Ah, very interesting. So what did I do here? This is my function. It takes a sales column and says np.where: if the column value is more than 10 it is eligible, otherwise not eligible, and it returns that. And I am just adding a new column. Somebody was asking how you add a new column; this is one way. I am saying that I want to add a new column called Jan_bonus. And what is the new column? I pass the Jan column of store_sales to the bonus function, and the result will be the new column called Jan_bonus. You can see that here. So what are you passing to the function? What is this sales column? It will be Jan. If you take Jan, it will take all these values, 8, 12, 16, and each will be compared, so 8 will be not eligible and 12 will obviously be eligible. You can also return np.where and everything directly, you can, but this is a
good one. That is how I wrote it before, I guess. But it has to be a list, I guess, if you are doing it like that; let me check. No, I don't think you can directly say return np.where; I don't think that is possible. def is the keyword for defining the function, and this is the name of the function, bonus_function. Now let me ask you one more thing. What if your manager says that this month we will give them a bonus if the store sales are more than 10, and next month it will be 20? It will not be the same every month. So how do you do that? This ten is not fixed every month. So what do you do? You say sales column, comma, and pass this number as well. Do you guys understand this? Here you are passing it as a variable, right. It is not required; I am just saying you could, depending on the business need. But what is more interesting will be this. I can say new_df equal to store_sales, then select_dtypes with include equal to int64, and then apply. One moment. Okay. So now see what I have done. This is the actual use of the apply function. Like I said, apply is something which will take your function and apply it to every column by default, and the function that I am calling here is bonus_function. So what is the bonus function doing? It will compare each value, whether it is more than 10. And I am applying it to the whole data frame. So for each month it will decide: if the sale is more than 10, they are eligible. If you look at the new data frame I created, it has just the eligible and not eligible values. Of course, you have only the index, right. But now you have a problem. What is the problem? Go on, go on.
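The bonus function built above, including the idea of passing the cut-off as a second argument, can be sketched as follows. The names bonus_function, eligible_values and Jan_bonus follow the lecture; the threshold parameter name, its default of 10, and the sample data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical store data; sales are in thousands, as in the lecture.
store_sales = pd.DataFrame({
    "store_id": ["S1", "S2", "S3"],
    "Jan": [8, 12, 16],
    "Feb": [10, 14, 6],
})


def bonus_function(sales_column, threshold=10):
    """Label each value 'eligible' if it is above threshold, else 'not eligible'."""
    eligible_values = np.where(sales_column > threshold, "eligible", "not eligible")
    return eligible_values


# Add a new column by passing the Jan column to the function.
store_sales["Jan_bonus"] = bonus_function(store_sales["Jan"])

# Next month the cut-off can change without rewriting the function.
feb_bonus = bonus_function(store_sales["Feb"], threshold=12)

print(store_sales)
print(feb_bonus)
```

Returning the np.where(...) expression directly, without the intermediate eligible_values variable, would give the same array; the variable is kept here only to mirror the way the function is dictated in the lecture.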
In the sense that we created a new data frame out of it, right. But by looking at this alone, no human will be able to make sense of which stores are actually eligible or not eligible. That is where you can do one more thing. You can concatenate two data frames. My original data frame is what? store_sales. I can say that I want to add store_sales and new_df together. I can do a concatenation, but you have to be very careful when you are doing concatenation, because you can either add them horizontally or vertically. Horizontal means they will be added side by side like this; vertical means one below the other like this. So ideally their rows and columns should match, otherwise you will end up having problems. Now my requirement is that I have two data frames and I just want to add them together. All right, let's see how to do that; then I will pick up your questions. I can say something like this: pd dot, one moment, concat; I think the function is called concat. What are the data frame names? store_sales and new_df. So what do you think happened? It is always good to check this. After the concat, how many rows are there now? 200. My data set has 15 columns and 100 rows, I did a concat, and what happened? It appended them one below the other. You can verify that again with .tail. If you don't want this, you can say what? axis equal to one. Now it more or less looks easy to read. You may argue that this is much easier to do in Excel, and yes, if you have 100 rows I would not need to come to this to do it. But if you have one million rows, Excel is probably not a good idea. So what did we do? We created a data frame called new_df, where I applied this bonus function. When I applied it I got an error, "cannot convert string". That is why I added
this select_dtypes here. Okay. And I created new_df. Now my original data frame is store_sales, and the eligible and not eligible values are in new_df. I say concatenate them together, with axis equal to one because I want to add them horizontally. Just after that, look at the shape and print it. You are getting the data frame that you want. So you have to rename the columns, because it will have the same column names. Yes. If you are picking up Jan you will get two Jans. So either you can rename the columns, or you can select with iloc from this to this range and change the names with .columns. So one way is that you say .columns and manually reset all the column names. Instead of Jan you can call it Jan_bonus or something like that. But in the default operation it will add the same column names; we are not changing them. This is also okay, I think, not good, but you can see Jan and Jan, so it affects the readability of the data. If I say .columns: no, they are not nested. They have all the columns, but they are, how do I say, repetitive, right. So one thing you need to take care of is that if you are doing this kind of concatenation, you might have to change the column names manually; otherwise they will remain the same. One option is to set .columns again and add the names manually. That is one way to deal with it. You can pass a list of the names. Look, you can access columns by numbers: when I say .iloc I can use only numbers, for either the row or the column, so you have to use 0, 1, 2, 3 there, like column numbers. It won't consider the column name. There might be a method; I don't remember.
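The steps above (apply the bonus function to every numeric column, concatenate with the original frame, then fix the repeated column names) can be sketched as follows. The data is hypothetical, and the function wraps its result in a Series, which is an assumption made here so that apply returns a frame aligned on the original row index.

```python
import numpy as np
import pandas as pd

# Hypothetical store data; sales are in thousands.
store_sales = pd.DataFrame({
    "store_id": ["S1", "S2", "S3"],
    "city": ["Pune", "Delhi", "Pune"],
    "Jan": [8, 12, 16],
    "Feb": [10, 14, 6],
})


def bonus_function(sales_column):
    """Label each value in a numeric column as eligible or not eligible."""
    labels = np.where(sales_column > 10, "eligible", "not eligible")
    # Wrapping in a Series keeps the original row index on the result.
    return pd.Series(labels, index=sales_column.index)


# Drop the string columns first, then apply the function column by column.
new_df = store_sales.select_dtypes(include="number").apply(bonus_function)

# Default axis=0 stacks the frames one below the other (rows are doubled).
stacked = pd.concat([store_sales, new_df])

# axis=1 places them side by side; Jan and Feb now appear twice.
side_by_side = pd.concat([store_sales, new_df], axis=1)
duplicate_names = side_by_side.columns.tolist()

# Assign a full list of names to remove the duplicates.
side_by_side.columns = ["store_id", "city", "Jan", "Feb", "Jan_bonus", "Feb_bonus"]

print(stacked.shape, side_by_side.shape)
print(side_by_side)
```

Renaming new_df's columns before the concat, for example by adding a "_bonus" suffix to each name, would avoid the duplicate names in the first place.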
I don't think so, not in the ordinary data frame. You can select with iloc and then remove it. I just want to discuss one small topic which is very important, but it is not there in your notebook. I have another notebook; I will share it with you. This seems to be a bit important. So can you guys tell me what is happening here? Okay, I will just run this. Oh, you need to import, right: import pandas as pd. So have a look at this. I just brought this example to show you something. What am I doing here? What is years? A list. What is f1? A dictionary. So I can create something called a Series by passing f1 and index equal to years. These things may not be really important, because it is very unlikely that you manually enter some data and create a data frame; usually you read it. But my point here is that a Series is a single column, and even a single column has a row index. So here, let me show you firm1. Is it printing something? No, it is not printing. So if I say print firm1, this is how firm1 will look. What is firm1? firm1 has an index of years, so 90, 91, 92, these will be the index, and then it will also have this f1. It will look at this dictionary, find the values, align them here, and create it like this. So basically, by running all this I created a data frame like this. This is my data frame. In the data frame the index is this, 90, 91, 92, and I have three columns: firm1, firm2, firm3. The point that I actually want to discuss here is these NaN values. It is very common that when you do data analytics you will get a lot of NaN values, and the question is what to do with them. One of the things you can do is use dropna. So what is firm2? firm2 is a series. And what is this data frame called? df3, right. So I can
say df3.dropna(). So can you tell me what happened when I did dropna? My question is: is it dropping them row-wise or column-wise? Are you sure? Only 93 and 94 were retained. You are right. So it drops the NaN values row-wise: wherever you have a NaN value, that row was dropped and you got only the remaining ones. That is okay. But if I say df3, what will happen? The original data frame is unaffected, right. So ideally you should assign the result of dropna, or I think you will also be able to use inplace equal to True. If you want the original data frame to have this, you say inplace equal to True. What will happen? It will drop from the original data frame, so your original frame will be affected. Usually you don't want to do that, so you assign it to a different variable. So this is one way of dropping the values. You can also do something else; for example, you can say axis equal to one. What will happen? It goes column-wise. So axis equal to zero means it drops rows, and axis equal to one means it checks column by column wherever NaN values are there and drops those columns. That is also possible. Another interesting thing is that there is a threshold you can mention. Have a look here. I am saying I want dropna with axis one and thresh equal to two. So what do you mean by this thresh? What is the original data frame? df3, right. Let's print that also, the original df3, and then dropna with thresh. Did something happen? Nothing happened, because my thresh is two. So let's say the thresh is a bigger number, five maybe. So it is the same there also, right.
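The dropna variants discussed above can be sketched on a small frame. The values, and the use of four-digit years for the index, are assumptions; they are chosen so that only 1993 and 1994 are complete rows, as in the lecture. One point worth stating precisely: in pandas, thresh is the minimum number of non-missing values a row or column needs in order to be kept.

```python
import numpy as np
import pandas as pd

years = [1990, 1991, 1992, 1993, 1994]

# Hypothetical revenue data with some missing values.
df3 = pd.DataFrame({
    "firm1": [1.0, np.nan, 3.0, 4.0, 5.0],
    "firm2": [np.nan, np.nan, 2.5, 3.5, 4.5],
    "firm3": [np.nan, 1.0, np.nan, 2.0, 3.0],
}, index=years)

# Default (axis=0): drop every row that contains at least one NaN.
rows_dropped = df3.dropna()

# axis=1: drop every column that contains at least one NaN.
cols_dropped = df3.dropna(axis=1)

# thresh=2 with axis=1: keep columns that have at least 2 non-NaN values.
at_least_2 = df3.dropna(axis=1, thresh=2)

# A bigger thresh is stricter: only columns with at least 4 non-NaN values stay.
at_least_4 = df3.dropna(axis=1, thresh=4)

print(rows_dropped)
print(at_least_2.columns.tolist(), at_least_4.columns.tolist())

# The original frame is untouched unless the result is assigned back
# (or inplace=True is passed).
print(df3.shape)
```

With thresh=2 nothing is dropped here because every column has at least three real values, which is consistent with "nothing happened" in the demonstration; raising the number starts removing columns.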
threshold value doing? thresh is the minimum number of non-NaN values that a row (or a column, with axis=1) must have in order to be kept. Anything with fewer real values than that is dropped; otherwise it is kept. So thresh works like that: it counts the actual values, not the NaNs.

Why is this important? Because you will probably encounter millions of NaN values in your data frames, and you might want to change them or drop them. You can also replace them, and that is one common thing we do. You can say fillna. What am I doing here? fillna(0). So instead of NaN, all those values will be filled with what? Zero, right.

You can also fill with the mean, for example. So here what am I doing? I'm saying df3.fillna(df3.mean()), so the mean of each column is calculated and used as the fill value. Or you can compute the value with some other function and fill with that.

That's something we commonly do. Let's say you're getting the annual revenue of 10 companies every year, and it comes in as a data frame. Now what happens is that in one year one company forgets to give you the data, so that becomes a NaN. When you start working with it, you don't want to treat that as no data. So what we do is this: we have the last 10 years of data and only one year is missing, so you take the mean of it and fill that in. Then you have some value to experiment with. You cannot simply say drop everything. So we replace that NaN value with the mean: the yearly revenue of the company is known, one year is missing, and the mean over that company's other years takes care of it. This is one technique; you can do anything, I'm just saying that this one makes sense. Otherwise, if you say that value is zero and then you calculate something, it will have a huge impact, because that is as good as saying the company didn't do any business that year. But really they just forgot to
give you the data, or something like that. So it is always good practice to find NaN values and then handle them.

I will share this notebook with you. It has some explanation also of these NaN values. It is not with you right now, but I will share it so you guys can practice on it.

Now, on data frames there is an assignment. Not now, don't worry; later. But I'll share this file with you. What is this... pandas exercise. So, pandas lab. Can you see this? There is the automobile data and there is something called pandas lab exercise. The idea is that you load this notebook and try to see if you can get the answers. If not, there is also a solution notebook, pandas exercise solution, in case you're not able to get it.

I'll just show you one of them. How do you upload it? You go here, Upload, and you pick the pandas exercise notebook, and we also want the automobile data. Then you upload both and I can just open this. So it looks like this: "We shall now test your skill using the pandas package. Answer these questions." You're supposed to read the data frame yourself; that is not given. The questions: load pandas as pd and load the automobile data. The automobile data actually has lots of columns: the make and model of the car, how many tires it has, what the engine is, what type. Then: check the head of the data frame. How many rows and columns are there? What is the average price of the cars? Which is the cheapest make? How many cars have horsepower greater than a given value? The three most commonly found cars? Questions about how the cars are priced, et cetera.

So this is one sample practice. We will give you more like this, so it's easy for you to go through, and the solution is also there.

Have you guys heard
about Kaggle? Yes, right. Why is it not giving any output? Oh, yeah. So for those who are not aware of it, Kaggle is probably one of the best places to find all the data that you want. It is a website which shares data with you, and you get a lot of data there. A lot of competitions also happen on Kaggle, so you can sign up for a competition: they give you the data and then ask you to find a solution. So for data science and ML this is the best place to get data. You can also see previous competition results, which people have uploaded, and that can help you get more ideas. This automobile data is actually taken from Kaggle. You can get lots of free datasets on Kaggle.

I have shared this notebook with you; it will be uploaded to the LMS. You can just download it and add it to your Jupyter.

Meanwhile, let's have a small discussion about NumPy. Basically, what is NumPy? Like I said, it stands for Numerical Python; that is why it's called NumPy. NumPy is a special data structure library available in Python which allows you to play around with multi-dimensional arrays. So the basic idea is to create multi-dimensional arrays, and NumPy supports something called ndarray. The array can be one-dimensional, two-dimensional, three-dimensional and so on. The practical application may not come up right now for you. You will see NumPy arrays later, when you start working with deep learning or other projects; that is where NumPy is very useful. And pandas and its data frames are actually built on top of NumPy.

So that part is very simple. You're saying, okay, there is NumPy. But I have a very simple question for you. Can you see this distance and speed thing I created? What are distance and speed? They're normal lists, right? Normal lists, with square brackets. Now don't run this; I think you already have the notebook. So I have a question. I'm saying time equal to distance by
speed, and then I'm printing time. What will happen? Okay, so you're saying it will give 45 by 5, 35 by 7 and so on? Absolutely wrong. Nothing like that will happen. You get a bunch of errors. It will not work. Will it ever work? No, it'll never work.

Actually the question is why. A list is a collection, and if I really want to do this with lists, I should write a for loop: for every element in this list, take the matching element in the other one and do the division.

Why am I saying this? Because it's very common that I'm looking at some data points. Again, take the Uber data: I got the Uber data, probably not as an Excel file, probably in some other format, where I have the distance covered by each Uber driver in an array or a list, and I have the speed of each car. And I just want distance by speed. But I cannot do this on lists. If I do it with a list or a dictionary or anything like that, I will end up with an error for sure. I just wanted to show you first that it is not possible. If I run it, see how much error output I'm getting; if I scroll down, it says unsupported operand types for list and list.

Now what I'm going to do is something very simple. I'm just going to import numpy as np. This part is not required. And look at what I'm doing. What is distance? It's a list. You can convert it to a NumPy array by simply typing array of distance. Do you have to say np dot, or did we import array separately? Let me do one thing: comment these things for a moment and run them one by one. Okay, no, you have to say np.array. So this is how you create a NumPy array from, let's say, a list called distance. I'm just converting it into a NumPy array, and again the speed I'm converting with np.array. Then we can just print them. Let's have a look at how the data looks. Right now the data looks pretty much like a list. Okay, but now what is
interesting is that I can simply do this: I can directly say distance by speed, and the result is time. And if you check the type of time, it's a NumPy n-dimensional array. That is how you know it's an ndarray, a NumPy type.

Excuse me? You had already imported it? No, I overrode it actually, so you have to say np.array. Ideally I always import only numpy as np, so whenever that import statement is there, you have to say np dot something.

So the real question is why this is useful. It is useful because NumPy data structures like arrays use precompiled code and fixed data types. Meaning, if you create a NumPy array of, let's say, integers like this, the operations on it are already compiled, and any operation you write on it, like addition or subtraction, is applied to all the elements really fast. So NumPy operations are often something like 10 to 100 times faster than the same operations on normal lists. You have heard about Fortran, right? The programming language. Fortran is considered to be one of the fastest languages ever created, and NumPy array operations are said to be about as fast as Fortran operations, because once you create an array it has prebuilt functions and data types behind it. So when you say sum, it treats every value as, you know, an integer and applies whatever logic you're asking for. The common use case will be that I have a set of data points and I want to apply a common operation to all of them.

Now a question to you. Are you all able to get this notebook? Okay. You can also do things like this, and now you'll be able to answer my question. Can you have a look at this code I'm going to run? Will it work, these three lines? It will work, but what happened? Concatenation, right. When you add two lists, what happens? Python concatenates the lists. That is the default behavior of a list. But you don't want that. Your idea was to add the individual numbers, right? That will not work.
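The list-versus-array behavior described above can be sketched roughly as follows. The sample numbers are made up for illustration, since the notebook's actual values are not recoverable from the recording; only the names distance, speed and time come from the session.

```python
import numpy as np

# Hypothetical sample values; the real notebook data is not shown in the transcript.
distance = [45, 50, 35]
speed = [5, 10, 7]

# Dividing one plain list by another is not defined, so Python raises TypeError.
try:
    time = distance / speed
except TypeError as err:
    list_error = str(err)

# Adding two lists concatenates them instead of adding element by element.
concatenated = distance + speed

# After converting to NumPy arrays, the same operators work element by element.
distance_arr = np.array(distance)
speed_arr = np.array(speed)
time_arr = distance_arr / speed_arr
summed_arr = distance_arr + speed_arr

print(list_error)
print(concatenated)
print(type(time_arr), time_arr)
print(summed_arr)
```

With plain lists you would need an explicit loop (or a list comprehension over zip(distance, speed)) to get the same per-element result.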
But here is what I can do. This is how you create an array: I can simply say np.array. One moment, where is it giving that error... f plus g. Okay, I'll just comment this out; I'll tell you why it is not working. So I'm just creating two arrays, 1, 2, 3, 4, 5 and 10, 20, 30, 40, and I can print them. But one thing to check: when I tried f plus g, I got an error. Why do you think we got an error? Because if you're adding NumPy arrays element by element, their shapes have to match. What do I mean by dimension here? The number of rows and columns. So if I add a 50 here, it should work. It works. So now f plus g is fine.

You can also say I want to add five to every element. All these common operations are broadcast to the elements in the array. That's the advantage of a NumPy array. Are you able to run these things?

If you closely observe the data types, it says int32. In the array the data type on this machine is int32. When you go to pandas and data frames it is int64 and float64, so they allow, I don't know, more space to be there.

I have a question; don't try this yet. What if I create a NumPy array with, let's say, 1, 2, 3, 4 and one string? Will it work? Will it throw an error? No? NumPy is numerical, right? So can you tell me what happened here? Ideally you should not do this. NumPy is Numerical Python, so you're supposed to work with numbers, and string values are not really advisable. But if your NumPy array encounters a string value, what will happen, and why? The reason is this. If you look at the data types, there is a hierarchy: the lowest level of data type is integer, the next one is float, and the next one is string. So if in your
NumPy array you have a mixture of floats and a string, NumPy will automatically cast everything to string. If you have a mixture of integers and floats, it casts everything to float. That is upcasting, and it silently changes the data. So now in my array I have some numbers and then one string, and NumPy says: everything I will convert to string. That is not a good thing to do, ideally. But you may encounter something like this.

Sorry, dtype. So with the string in there, the dtype is a string type, shown as something like <U21; pandas would show such a column as object. How do you know? If I remove the string and add 5 instead, what is it? Integer. If I say 5.5, it'll be float64. So I can check dtype for that.

Now, some very important functions you need to remember. In NumPy there is a function called arange. I'm going to comment some of this out; let's say we don't want this. So arange is a function that generates a range of numbers. Do you have to say np dot here? I was trying to mess with the code earlier, I guess. You say np.arange and it should work.

So basically what is arange? It generates a range of numbers. If you say 10, it'll be from 0 to 9. This will be useful in case you are looking at some sampling data. If I say 15, it will give me what? 0 to 14. And the result is a NumPy array, basically.

You can also mention this: I'm saying 5 is my starting point, 56 is the ending point and 5 is the step. Can you see this? I did np.arange(5, 56, 5). So what will happen? It starts with 5, and 5 is the step, so it'll add 5 each time: 5, 10, 15,
20 and so on, up to 55. It will never include the end point itself; that is the difference. So if I make the end 55, what do you think will happen? Will it print 55? No. It'll print only up to 50, because it starts at 5, adds 5 each time, and when it reaches 55 that last element is omitted. So it's like a step function to generate numbers. It can sometimes be very useful, because you want to generate a set of numbers and work with them.

There is also a very similar function called linspace. arange and linspace are very similar; the basic difference is in how they split the range. Run it once, then you can understand the difference. I don't know why there is an error everywhere; I think I was doing something here, I will just comment it out.

So that is our np.linspace function, and what it does is very simple. Let's say I want the start to be 30, the stop to be 100, and the number of values to be 5. Can you take a guess? I ran this and I got this. What could have happened? When you say 30, 100 and 5, the first value will be 30 and the last will be 100, so two numbers are fixed. Then what is this 5? You will get five equidistant numbers: 30, 47.5, 65, 82.5 and 100. The gap from 30 to 47.5, 47.5 to 65, 65 to 82.5 and 82.5 to 100 is the same number each time.

So what is the formula? How does it calculate the gap between 30 and 47.5? 100 minus 30, divided by 5? Or by 4? By 4, I think: 5 minus 1. 100 minus 30 is 70, and 70 by 4 is 17.5. That is what you see here. So the formula is 100 minus 30, divided by 5 minus 1. That is how linspace works.

And you can do this for any numbers. For example, I can say I want, between 1 and 10, 32 equidistant numbers. The output will be in float. Why? Because in many cases the gap will not be a whole number, so the output of linspace is cast to float for precision. So here you see it printing 32 equidistant numbers between 1 and 10. So this might be
useful. Are you able to run these things?

This one shows as 30 point; it won't show decimals. If it is a whole number, it just prints 30 followed by the point. 30 is the starting point, you're creating numbers between 30 and 100, and since 30 is a whole number it won't print 30.000.

So now comes the real thing: dimensions. Let's see a little bit about dimensions. You know what a dimension is, right? Normally we say there is one dimension, there are two dimensions and there are three dimensions. Is there a zero dimension? Who thinks there is a zero dimension? Practically, no. But actually there is something called a zero dimension. The dark dimension? No, that's from Doctor Strange; it's not something animated. In mathematics there is such a thing as zero dimensions: a single point, or in array terms a single number.

One dimension is what we created with this NumPy array. If you say 1, 2, 3, 4, that is one dimension. Two dimensions is called a matrix. Anything more than two we will also loosely call a matrix here, so three dimensions is also a matrix. But when you visualize it, that's totally different. For example, if I want to visualize three dimensions I will probably say a globe is three-dimensional and a circle is two-dimensional. In NumPy you can have 2, 3, 4, 5, any number of dimensions, and you can visualize them also, not literally, but that is how NumPy stores them.

So let's have a look at dimensions here. If you scroll down, what are we doing? I'm creating an array. You see all these other lines I have; I just want to selectively uncomment them, which is why I'm commenting everything else. People are thinking, why are you doing this? So basically what I'm doing here is creating an array. And what numbers do I have? A bunch of integers, 24 of them in total.

Now, one of the very common,
commonly used functionalities in NumPy is this reshape function. So what is reshape? I'm saying data.reshape(8, 3). What will happen? This will convert my NumPy array into an 8 by 3 matrix, so you will have eight rows and three columns. Can you see here? 1, 2, 3, 4... how many rows? Eight rows and three columns. So how many dimensions does this have? Two dimensions. But we will normally call it an 8 by 3 matrix.

So reshape is very useful. For example, I can say I want something called data2, and I can say reshape 2 comma 2 comma... how many elements do we have? 24. Total numbers are 24, so I can say 2 comma 2 comma 6: two times two is four, four times six is 24. So let's say you do this. This is a three-dimensional array. How? Because I said reshape(2, 2, 6). What does this mean? Two rows? You cannot simply say two rows and two columns, because it is three-dimensional. Axis 0 will have 2, axis 1 will have 2 and axis 2 will have 6. So if you look at the data, it has two blocks; in each block you have two rows; and in each row you have six elements. Can you visualize it? That is what a three-dimensional array looks like.

It is not really important to visualize this; nobody is going to ask you, "I'll give you some data, visualize it in your mind." But sometimes, for mathematical calculations and all, the data will come in this format: the data will be in a three-dimensional matrix format.

Somebody was asking: take a break if you want, you can go right now, no problem.

Now, this is actually three dimensions. When you reshape, you just pass the three dimensions as arguments; how you visualize it is a totally different thing. One way, as I told you, is to just imagine it. If I'm looking at this number 6, it is in the first block, so that is the first axis. In that you have two rows, and it is again in the first.
And at the fifth position is this element. That is how I have to access it. So three levels of indexing: that is what I mean by three dimensions, actually.

Now what will happen if I try to do this? Can you tell me? I say 2 comma 2 comma 10. There it's absolutely an error. Why? It can't fit. 2 by 2 by 10 will be how many elements? 40. How do you fit 24 values into 40 slots? So the shape actually matters. If that is 2 comma 2 comma 6, then it should work.

How do you make this four-dimensional? Sometimes it is easiest to add a 1 as an extra dimension. That is four-dimensional. So this is how four dimensions look. It is difficult to see, but you get the idea: the more brackets the numbers are in, the more dimensions you have. I don't know how else to explain this; some things are really hard to explain. Don't worry about four dimensions right now. For now I think 8 comma 3 is enough for us, because two dimensions are easy for us to work with.

So you can do things like this. For example, what is data? Oh, I didn't run the reshape; the problem is that we'll all get confused by the end of the day. So your original is data, and let's say we keep the reshaped one as data2. I reshape it, and let me just print data2. Is it working? Yes, that is working.

And I can do this. This is very similar to the pandas concept: in pandas, you remember, we had a filter. So see what I'm doing here. What is data2? Your matrix. I'm saying data2 percent 2, equal equal to 0. So what am I picking? Even numbers, right. If I just print it I get True and False, because it is applied to the whole matrix, and I can pass it as an index to filter my data2. So I can simply say data2 of even_data and you will get this: you're getting only the even numbers from the array. Just as an example. So
you're calculating whether each number is divisible by 2 or not, to get even or odd. So the test passes through each element of the matrix. The output need not have the same structure: here the matrix was 8 by 3, but not every element satisfies the condition, so the output is a normal one-dimensional array.

You also have a method called np.zeros, which can produce zeros if you want. That alone may not look useful, but this might be more useful: I can say np.zeros of 3 comma 5, plus 6. What is happening? You're creating a 3 by 5 matrix of zeros and adding six to every zero, so you get a matrix full of sixes. These things will become handy when you start going into machine learning; not right now. You may be wondering why you would want so many zeros, but there is also an np.ones method, which gives you ones. And I can say 5 by 9, any shape I want.

What is an identity matrix? Anybody know? If I'm creating a matrix, let's say of size four... this is the identity matrix. What is that? If you flip the rows and columns, the matrix remains the same, and the diagonal values are one with zeros everywhere else. That is called an identity matrix. This might become useful, and you can create an identity matrix directly: I can simply say np.eye and just say 5.

Question to you: can I do np.eye of 5 comma 6? Now, don't run this. np.eye of 5 will create five rows and five columns, equal. Can I ask for an identity of 5 comma 6, five rows and six columns? Theoretically this does not work as an identity matrix, but strangely it does run: it creates a 5 by 6. Some concepts are very confusing like this. The command is trying to create an identity matrix, but this is not an identity. Why it allows that, I absolutely have no clue.
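As a rough sketch of the masking and array-construction calls just described: the 24 integers below are illustrative stand-ins for the notebook's data, and the variable names other than data2 are assumptions made for this example.

```python
import numpy as np

# Illustrative data: 24 integers reshaped to 8 rows by 3 columns.
data2 = np.arange(1, 25).reshape(8, 3)

# Boolean mask with the same shape as data2: True where the element is even.
even_mask = data2 % 2 == 0
# Indexing with the mask returns a flat 1-D array of the matching elements.
even_values = data2[even_mask]

# 3x5 array of zeros with 6 added to every element.
sixes = np.zeros((3, 5)) + 6
# 5x9 array of ones; the shape is passed as a tuple.
ones = np.ones((5, 9))
# 5x5 identity matrix.
identity = np.eye(5)
# Non-square variant: 5 rows, 6 columns, ones on the main diagonal,
# and a final column that is all zeros.
wide = np.eye(5, 6)

print(even_values)
print(sixes)
print(wide)
```

np.eye accepts a second argument for the number of columns, which is why the 5 by 6 call runs even though the result is not an identity matrix in the mathematical sense.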
It is working, though. No one has asked me why before, and I don't know; I've been trying to find out how it is working. Normally an identity matrix should have an equal number of rows and columns; only then does the concept make sense. But you can also ask for more columns, and then np.eye just adds them as zeros beyond the diagonal. It works like that.

Now, if you want to print something like this, what is happening here? I'm just filling the diagonal with my own numbers. The identity matrix has ones on the diagonal, but in certain use cases where you want to fill it yourself, you can fill it like this.

Well, that is all you need to know right now about NumPy. Not much, just enough to understand that these operations are possible.

So, when you're building a machine learning model or something like that: let's say your customer comes to you, gives you some data, and you want to build an ML model on top of it. One thing that people do is simply build the model, test it and all. But it is always very good if you can come up with a story about the data. Visualizations help you make a story from the dataset and then show it to your customer or somebody.

In Python we have a library called Matplotlib. That is actually the original library for visualization. But these days it's very rare that you use Matplotlib directly; we use a library called Seaborn on top of it. Seaborn is the actual library that people use. It is built on top of Matplotlib and has many more visualizations. There are other visualization tools as well: Tableau is a BI tool, but that is not in Python, that is a separate product. Mostly what one uses is Seaborn, so that is the library I'm talking about.

So now let me ask you something. Let's say you have some data, and let's say you have only integers. When you say visualization, normally they say there is something
called univariate and bivariate. Univariate means you are looking at a single variable and visualizing that. Bivariate means you are looking at two different variables and visualizing them together.

For example, take the ages of the people in this class. Let's say I want to plot a graph considering all the ages of people in this class. That is my intention: I just want to plot a graph, and I'm considering only one thing, which is nothing but the age of the people in this class. So can you tell me what kind of visualization that is? Not in Python, in general. If I ask you to plot integers, what will you do? Do you know any names? If you have integers or floating-point numbers, and it is a single variable, univariate, it is called a histogram.

And what does a histogram look like? It will look like this. Here we have the ages of the people, so different ages will be there: 20, 30, 35, 40. Pie chart? Pie charts are normally for categorical data, different categories. Here you want bins. These are bins. So you will have ages between 20 and 25, that is one bin, and then how many people fall in it. So basically the idea is that you have an axis with the integer or float values, and you count how many times values occur in each range, and each range is sort of a bin: 20 to 25, 30 to 35, and then how many are there in each, say five people or ten people. If you look at it, it looks similar to this. I will show you this practically; I'm just describing what we normally do.

Now, the same thing if I have a string. Let's say I just want to draw a graph of people from this class based on the city you're coming from: Bangalore, Chennai. That's a string column. What is that called? That's not a histogram; that's a bar chart, right? So
that is a bar chart. Well, there is not much visual difference between these two. The only difference you can see is that here we will have Mumbai, Chennai, Bangalore, and these bars will be separated; they won't be touching. In a histogram the bars are all together. Here we have separate ones for Mumbai, Chennai, Bangalore, because you're basically looking at a string column and asking how many people are from Mumbai, from Chennai, from Bangalore. It is categorical, so you will have spaces between the distinct bars. That's what we call a bar chart. So in univariate analysis, normally you draw a histogram or a bar chart. That's very common.

Now let's talk about bivariate, where you have two variables to compare. So far we said one is a histogram and the other is a bar chart. What if you have two variables? Bivariate analysis. For example, I'm considering two numeric variables: I'm looking at the age of people and then the salary, and I want to compare age and salary. Both of them are integers or floats. So what kind of plot will you draw for this? X and Y, with X on this axis... yes, who said that? Scatter plot. It is called a scatter plot. Why? Because your graph will look like this: here you have age, here you have salary, and then you have people scattered like this. Each dot represents a person, a combination: an age of 30 and some salary will be plotted here. So you'll have a lot of dots like this.

Now this is a bit confusing. A scatter plot is good, but normally you don't want to just plot it. You plot it, and then, as in most of the statistical methods (you will learn statistics in the afternoon), we try to fit a line to the plot. So you fit a line like this. Why? Because this will show
you the trend in your analysis. By simply looking at a scatter plot you don't know what is happening; there are so many points. So there are statistical methods with which you can fit a line. Now if I fit a line and the line runs up through the middle like this, it means there is a relation: as the age goes up, the salary also goes up. In some other scatter plot, where you're plotting something else, the fitted line may lie flat like this. That means there is no relation. You have commands and methods to fit a line; you can call one and say, I want to fit a line to this. I'll show you. It will fit it and show you the tendency of your graph, which way it leans. So this is called a scatter plot.

Now, what if you want to do something else? You want to look at people who are coming from different cities. Bivariate analysis again: you take the people coming from different cities and compare their salaries. I want to find out whether the city a person comes from has an impact on salary. That is again bivariate. But what is the problem here? My variables are the city, a string, and the salary, a number. What will you plot? Scatter will not work. A bar chart will not work. Pie chart? Okay, pie charts we are not discussing now; that's off topic, let's get to that later. Bubble plots? There is something like that too.

Technically there is something called a box plot, and this is what we use in Seaborn. There are a thousand types of plots; we are discussing the ones specific to Seaborn and data analytics. Because you are learning this more from the statistical point of view, if somebody asks you what kind of plot, your answer is box plot. Maybe you have not seen a box plot; it's fairly rare. But the idea is very simple. What a box plot does
is like this. Let's say there are three cities. The cities will be here: Bangalore, Chennai, and then Delhi or Mumbai, something. Three cities. And then you will see something like this. It's a weird kind of plot; I don't know how many of you have seen this. A weird plot like this is called a box plot, and it is important because it gives you meaningful insight.

What it actually uses is this: there is something called mean and median. Mean you already know. So what is the difference? Are they the same? No, median is different. For example, you are calculating the average salary of people in this class. We are all here. Now let's say I invite Mark Zuckerberg to come and sit in the class. Then if you calculate the average salary, all of us will be millionaires. That is the mean, the average: it gets impacted. Mark Zuckerberg has billions, so when you calculate the mean, the average class salary becomes many millions and you all easily become millionaires. That's mean. But if I do the same with median, what does it do? It orders the salaries in ascending order and picks the middle value. You get a much better sense of the data.

So in the box plot, it'll pick Bangalore, and the salaries will be arranged in a percentile fashion and it will show you the middle value, the median. So if I look at this, I can see that people from Delhi have a higher salary, because this is the median of Delhi and it is higher than this one. By looking at this I can say that people from Delhi have a higher salary compared to people from Chennai and compared to people from Bangalore. So this will draw you the percentiles, the 25th percentile, 50th percentile and 75th percentile, as a box like this. So when you have an integer and a string, normally what we draw is a box plot to get an idea.

Now the point is, what if I want to further divide this into male and female? So going from bivariate, I want trivariate. So I want to
Look at Bangalore and Delhi: I want to plot the salaries, and within that I want male and female, so things become more complicated. You will see something like this here; one more box will come. That's how the plot will work, because now you are again categorizing by male and female. But that's not a good idea. If you have three or four variables, the plotting will work, but you cannot make sense of what is happening, right? So visualizations are not very powerful there. I believe you should do univariate and bivariate analysis; anything more than that is typically not a good idea to plot. And that is exactly where your ML and all are going to help. Machine learning. For example, I have some sales execs, and my sales are declining every month, so I want to know why the sales are going down. I can easily draw that: I can take the age of my executive and the sales he is doing. That will give me a graph and show me that, OK, probably the problem is in this age group. But now I'm thinking I want to include the age of the person, the sales, and also which city he is coming from. Is he married? How much distance is he traveling? So I want to include, say, seven or eight parameters, and you cannot draw that. That is where you call in a machine learning algorithm: let this guy work on all these features, seven or eight features, and then tell me why my sales are declining. That is where ML actually comes in. So basically you can say that it is an extension of your visualization; not of visualization as such, but of problem solving. Sometimes I handle the introductory ML class and I give a very interesting example, though it is not related to visualization. People ask a lot, "What is this machine learning? Can you explain it?" People talk a lot about machine learning, but I always give them a very simple answer. So if you draw quadrants like this, let's consider four quadrants. So the first quadrant in
this, call that KK. That means these are things which you know that you know. For example, you know Java. You are a Java developer, so you know that you know Java. That is your understanding. So the first quadrant is where you have knowledge of something that you already know you have. The second quadrant is KDK: things you know that you do not know. What is that, for example? Machine learning. You don't know machine learning, and you know that. So that is the second quadrant. Then there are things that you don't know that you know. I'll give an example: skiing. Have you ever been skiing? So I went skiing, and these guys fitted all these things onto me. Normally what they do is come along with you, because you don't know skiing, and slowly, slowly make you ski. But for some reason, I don't know whether he didn't like me or not, he just pushed me from the top. Then I was in the third quadrant, and I found out that I know skiing. I really skied. I never thought I knew skiing, but he pushed me, and I'm like, "Oh my God, I'm actually skiing." So that is the third quadrant: things which you do not know that you know, which you learn from experience or feedback. Then what is the fourth quadrant? Things you don't know that you don't know; things which you would otherwise never know. So now, how do you put it in perspective? The first one is your normal analysis. This is like your SQL queries, your programming, because these are things which you know that you know. You have some data, you want to write a SQL query, you know where the data is, you know the query, you know everything. The second quadrant is things you know that you don't know. This is where business intelligence and all come into the picture, BI. For example, I have some data and I have a visualization tool like Tableau. So I'm plotting a
visualization graph. These are things which you know that you don't know. I have last year's sales data, but I don't know whether my company is doing well or badly. So I use a visualization to look into the data, and after that I know what is happening. The third quadrant is things which you do not know that you know. That's like a feedback mechanism; data mining and all will help you to do this. And the fourth quadrant, this is machine learning, and this is the gold mine. That is where you are. Meaning, somebody is going to give you some data, and they are going to say, "Tell me something which I do not know about this, which I could never figure out myself, and which will help me as well." That's where you fit. And that's the actual definition: machine learning is where you have some data and you have no clue. I mean, you already know something about the data, but you're trying to figure out something which is not possible in any of the other three quadrants. You're trying to understand something which nobody can otherwise give you an idea about. So you're going to get some data, get a useful insight from the data, and then show it; that's all you're doing. So the fourth quadrant is actually where ML algorithms and all come into play. I usually give this example. And that is why the people working in this quadrant are actually very costly. So this may be the lowest salary, generally speaking, then these guys might get more salary, then more, and this is probably the highest salary.

As for libraries for plotting these graphs, I'll show you. We'll be using only Seaborn; Plotly we don't use much. The only difference between Seaborn and Plotly is that Plotly has this kind of thing: if you hold your mouse over a dot, it will show the data. Seaborn doesn't do that. If you do a plot, you can see the plot, but it will not show the numbers. Plotly is much more interactive. Apart from that, not many major differences are there
between them. So let's look at this visualization in the notebook as well. You just need to upload a notebook; I'll tell you which one. Just go to your desktop; you should have a notebook there. Can you see "Python visualization"? Just open that and upload it. You also need some data, so upload this housing prices file. That is the housing prices data. One is the notebook, and one is the data you're going to use. I'm not covering everything, only what is important for you. So just run this first cell and also the second cell. Basically, what are we importing? We are importing pandas as pd, seaborn as sns, numpy as np, and matplotlib.pyplot as plt. So we're importing matplotlib and seaborn, and on the last line you see this %matplotlib inline. What does this mean? Normally, even if you plot graphs, the problem is that the notebook will not display them: the graph will be plotted, but it will not be displayed on the screen. So you have to add this inline functionality so that it displays the graph in the cell. That is why we are adding this inline. And what we're doing here is just reading the housing data and looking at the shape of the data. This data set is actually quite complicated, in the sense that it has around 81 columns. A lot of columns, actually. If I show you the housing prices file, see, it has around 81 columns, so it's very difficult to show all of them. What it actually has is housing sales data from the US. So you will have the square feet, number of bedrooms, number of bathrooms and all the properties of the house, basically, and the sale price. There are many different columns; you can go through them to understand the data. So you have LotFrontage, LotArea, Street, LotShape, Utilities, PoolArea. These are the columns we have; so many columns are there. So read the data. You can also look at the columns. So these
are all the columns we have: what the foundation is, the basement square feet, the basement exposure, the bedrooms, the square feet, all these things are there. Are you able to read the data, I mean load the data? Yes? Right. So let's look at the univariate analysis. What I'm going to do is a distplot on the LotArea column. LotArea is a column which has integer values. Normally, in the case of Seaborn, when you want to draw a histogram you say distplot; distplot is your histogram, and it is very easy. You say sns.distplot, and you say I want to look at the lot area. There is something called kde, and it is False; I'll tell you what it is. If I run this, this is how it looks. So this is a histogram of the lot area. And there is something called kde: KDE stands for kernel density estimate. I'll tell you what it is; it is False right now. OK, but can you make some useful inferences from this plot? If you see this, are you able to make any inferences? No? So one inference which I can make is that there are a lot of outliers in the data, or, as you would say, a lot of skewness in the data, meaning a lot of the data is actually shifted to this side. Most of our data is coming in this range here, and over here I also have values, but they are very, very few. These are called outliers in the data. In some machine learning problems you need to remove outliers. A common example is, like I said, the salary of Mark Zuckerberg. If I'm plotting the salary of everyone and there is a millionaire or billionaire, that is called an outlier, because it may impact our overall analysis. In some cases, in ML algorithms, we keep them and treat them as outliers; in some cases we remove them, so as to avoid messing up the
data. So this data actually has outliers; it is shifted towards this side. It's not a uniform distribution that we have. That's one thing. What will happen if you set kde to True? Let's set kde equal to True. Here there is only one dimension, x, and that is the lot area. So you are taking the lot area, and the lot area runs from zero to, let's say, two lakh or something, and then you're finding how many points fall in each range. So there are probably 200, or more than 100, houses with a lot area in this first range, then again this many in the next, and so on. So the bars show the number of houses which have this lot area. If you run this with kde equal to True, the only difference is that it will draw a sort of curve over this and try to fit all the points, so that the total will be one. You will learn about KDE later, but basically what it tries to do is fit all these points within a sort of curve, so that the sum is one if you add everything together. These are typically used in statistics. As for the range of the x-axis, right now you cannot change it here; there is a way, and I'll show you how to do that. OK, so now have a look at this. Here I'm saying sns.countplot. And what am I considering here? Exterior1st, and the data is housing. So let's just run this and see what it is. Can you tell me what kind of graph this is? It is a bar chart. Because what are you doing here? If you want a bar chart, you say countplot on the exterior, the coating on the exterior of the house, what you're using: vinyl, metal and so on. It will plot a bar chart of these things, like this. Now what is the second line I have added? If I just comment that out, I'll tell you. This is useful. See, with the second line, what I'm doing is taking the x-axis and rotating it a bit to fit the
labels. See, now you have vinyl and everything, but you're not able to read them, right? Because they are cluttered. To remove that, there is a standard line of code. You add this code: you say set_xticklabels, and you pass get_xticklabels with rotation 40. Basically, what will happen is that the labels will spread evenly so you can read them. So you can add this line if you just want to read them. Now you can see that the largest number of them is vinyl siding, and the least is, what, brick or something. So this is how you plot a bar chart.

Let's go on. Have a look at this. What is this? This is called a regplot; in the case of Seaborn it's called regplot. What are you plotting here? It's a bivariate analysis. You are plotting lot area and sale price. So you are checking whether there is a relation between the lot area and the sale price. Lot area and sale price are both integers. So what is that? It's a scatter, right? And what it will do is fit a regression line along with it. This is the regression line that it fits. I think I have added it; yes, I have added it here. So let me show you something. If you look here, even though this plot looks nice, there are a lot of outliers. Look at this point; look at this point. Here is where the density is, it shows that very clearly, but you have a lot of outliers here. This is your sale price and this is the lot area. So there are some houses like this where the lot area is very high and the sale price is very low, and also here you have some houses where the lot area is very small but the sale price is very high. They are few, not many, but you just want to remove the outliers; you don't want them. There is a way you can do it. Let me see if it works. Have a look at this code. What I'm doing: there is a function called quantile.
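The two plots just described, a count plot with rotated tick labels and a regplot with its fitted line, can be sketched as follows. This is a minimal sketch on synthetic data, not the notebook from the session: the data frame contents are invented, and the column names `Exterior1st`, `LotArea` and `SalePrice` are assumed to match the housing file.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; in a notebook, %matplotlib inline shows the figure instead
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the housing data (all values invented for illustration).
rng = np.random.default_rng(0)
n = 200
lot_area = rng.uniform(2_000, 20_000, size=n)
housing = pd.DataFrame({
    "Exterior1st": rng.choice(["VinylSd", "MetalSd", "Wd Sdng", "HdBoard", "BrkFace"], size=n),
    "LotArea": lot_area,
    "SalePrice": 100_000 + 8 * lot_area + rng.normal(0, 20_000, size=n),
})

fig, (count_ax, reg_ax) = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))

# Bar chart of category counts; keep the returned Axes so its labels can be changed.
sns.countplot(x="Exterior1st", data=housing, ax=count_ax)
count_ax.set_xticklabels(count_ax.get_xticklabels(), rotation=40)

# Scatter of two numeric columns plus a fitted regression line.
sns.regplot(x="LotArea", y="SalePrice", data=housing, ax=reg_ax)

fig.tight_layout()
```

Depending on the matplotlib version, the `set_xticklabels` call may print a warning about fixed tick locations; the rotation is still applied.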
With quantile, what I'm doing is picking the lot area and asking for the quantiles 0.5, 0.95 and 0.99. What do you think this is doing? It is finding the percentiles. So in the lot area, I want to find what the 50th, 95th and 99th percentile values are if I arrange the values in order. If I run this, I can see this is 0.5, this is 0.95 and this is 0.99. Now there is a problem, because 0.5 is 9,478, but when I go to 0.99 the value is very high, 37,567 or so. That is because of outliers; that is why it is very high. The median value is around 10,000, but the highest of these values is around 37,500. So that means clearly there are some outliers in my data. If I want to remove outliers, I can do things like this. I'm again doing a plot, but I'm saying housing.loc: I just want to pick the rows where the lot area is less than the 0.95 quantile. So I am removing everything which is above the 95th percentile. If I plot now... OK, we need housing_sub, right? So data is housing_sub. What I did is save the filtered data as housing_sub. Now I'm plotting again with lot area and sale price. Can you see? It's much more, what to say, free of outliers, because I'm considering only the data up to the 95th percentile, and the points are lying much closer together. So by looking at this graph we can probably say that there is a relation; the line is actually pointing upwards. So you can say that as the lot area increases, the sale price also increases. So one way to remove your outliers is to use quantiles. Find the quantiles and get the median value. Then, if you feel that after the 70th percentile every value is very high, you say, "I want to filter it; I don't want anything after 70," and put that in. On removing individual values: I don't think that is possible this way. I cannot pick an individual value and remove it like this. No, I just took the
0.95 point; it can be any number. Right now, when I ran the quantiles, I saw that at 0.95 it is around 17,000. My median is around 10,000. OK, that is in range; up to around 20,000 is fine. So up to 0.95 I'm OK, but when I go to 0.99 there is a huge difference, so I just thought I'd keep it at 0.95. It can be any number; that you need to decide, and take a decision on what number you want. Say there are one million points, one million houses and the prices of one million houses. No, that depends on the column. You're not looking at the whole data. Here my analysis is on lot area. So I first pick the lot area and run quantile to see how the distribution is. I'm running quantile on the LotArea column. It doesn't consider the whole data; it considers only the LotArea column. What it does is very simple. I'm saying LotArea dot quantile, and then, let's say, 0.2, just as an example. What it does is arrange the lot area in ascending order. So the lot areas will start with 1, 2, 3, 4, 5, whatever numbers you have, and then it calculates the cut point on that. If I say 0.5 on 1, 2, 3, 4, 5, you get the middle value; with 0.2 you get the 20th percentile, whatever value comes at that position. So then you can look at the data and say, OK, at 0.2, this is roughly the value at which 20% of the data is reached. So then you know whether it is an outlier, and whether you should remove it or not. And then you can make a decision; that's up to you. But this is one way of doing it. So now you see this is much cleaner. OK, so I just want to pick up one more interesting thing.
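The quantile-based filtering described above can be sketched with pandas as follows. The data here is synthetic (mostly moderate lot areas plus a handful of very large ones), so the printed quantiles will not match the numbers quoted in the session; `housing_sub` follows the name used there, and the 0.95 cutoff is the same judgment call, not a rule.

```python
import numpy as np
import pandas as pd

# Synthetic lot areas: mostly moderate values plus a few very large lots.
rng = np.random.default_rng(42)
typical = rng.normal(10_000, 3_000, size=990).clip(min=1_500)
huge = rng.uniform(50_000, 200_000, size=10)
housing = pd.DataFrame({"LotArea": np.concatenate([typical, huge])})

# Quantiles are computed on this one column only, not on the whole frame.
print(housing["LotArea"].quantile([0.5, 0.95, 0.99]))

# Keep only the rows below the chosen cutoff.
cutoff = housing["LotArea"].quantile(0.95)
housing_sub = housing.loc[housing["LotArea"] < cutoff]

print(len(housing), "rows before,", len(housing_sub), "rows after")
```

A large jump between the 0.95 and 0.99 quantiles, compared with the gap between 0.5 and 0.95, is the signal that the upper tail contains outliers.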
Basically, if you look at the columns of our data, there is something called square feet. What is the name of the data frame I created? housing, is it? housing, right. So when you look at the columns, there are certain columns where square feet comes into the picture. Let me show you. Where is it? So here you have basement finished square feet, here total basement square feet, then again some basement square feet. So basically what I'm doing here, by running this, is just selecting all the columns where there is an "SF" in the name. So you want to do some analysis based on square feet. Now imagine I want to plot multiple graphs. How many columns are there with square feet? Nine. I have nine columns where there is square feet in the name. Now for each column I want to plot a graph, because there is the basement square feet, the square feet of the kitchen, the total square feet; for each I want to plot a graph. Multiple graphs I want to plot, but all on a single canvas. So if you want to do that, let me split it up. This is a list comprehension, which we discussed, right? An expression, for x in list. This is called a list comprehension because all of this is within a list. So I can say column_name for column_name in housing.columns. Basically what it does is go through housing.columns, and if "SF" is in the column name, it just returns it. That is stored as sf_columns. So if I look at sf_columns, you will get all the columns where there is SF, square feet, in the name. So now my problem is that I want to plot nine graphs in one figure, because I have nine square-feet columns, and for each I want a plot. Say my customer is saying that he just wants to know whether the square feet of my basement
affects the sale price, or whether it is actually the kitchen square feet that affects the total sale price. I want each area to be plotted. So how do you do it? In order to show this, I first need to show you something. This is a very useful function. Can you see? You have the notebook with you, right? There is something called subplots. You can say number of rows three, number of columns three, figsize ten by ten. Basically what it does is return a canvas to me where there are three columns and three rows. It is like a subplot: you have one canvas, and within that there are nine individual graphs. The command you need to use is plt.subplots, with how many rows and how many columns. If you change this figsize, the total size will change; that is all that happens. If you run only this line, what you get is this plot. And if I want to get the first graph I can say [0, 0]; then there is [0, 1], et cetera. So if you look at this code, what I'm doing here is for i in range over the columns. Let me see if I can print the rows and columns. Let me just increase the font; I think you are able to see that now. Let me see if it works. OK. Have a look at what I'm doing. This first line is basically plotting a canvas where you will have three rows and three columns, so in total you'll get nine small graphs within it. Then look at this for loop: for i in range(0, len(sf_columns)). What is the length of sf_columns? Nine. So I'm opening a for loop, and I'm saying that the range is zero to nine, so 0 to 8. I'm saying rows equals i integer-divided by three, and columns equals i percent three. So what will happen for each i? If you run this code, it'll print 0 0, 0 1, 0 2, 1 0 and so on. Can you understand how it is printing this? Because first i will be zero.
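The column filter and the grid indexing described above can be sketched together. This is a minimal sketch: the column list is a hypothetical stand-in for `housing.columns`, with names modelled on those mentioned in the session, and the loop only titles each cell rather than drawing data into it.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt

# Hypothetical stand-in for housing.columns (names assumed, not read from the file).
all_columns = [
    "LotArea", "BsmtFinSF1", "BsmtFinSF2", "BsmtUnfSF", "TotalBsmtSF",
    "1stFlrSF", "2ndFlrSF", "LowQualFinSF", "WoodDeckSF", "OpenPorchSF",
    "SalePrice",
]

# List comprehension: keep only the names that contain "SF".
sf_columns = [name for name in all_columns if "SF" in name]

# One canvas holding a 3 x 3 grid of individual axes.
fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(10, 10))

positions = []
for i in range(len(sf_columns)):
    # Integer division gives the row, the remainder gives the column.
    row, col = i // 3, i % 3
    positions.append((row, col))
    axes[row, col].set_title(sf_columns[i])

print(positions)
```

Integer division (`//`) matters here: plain `/` would give a float in Python 3, which cannot be used to index the axes array.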
If you do zero divided by three, it is zero, and zero modulo three is again zero, so you get 0, 0. This is like accessing the first graph. If I say [0, 0], I access this graph; I'll show you how to use it in the code. [0, 1] accesses this one, and [0, 2] accesses this one. So I'm first just writing a simple for loop, so that I have the index positions to access the individual subgraphs. Basically, if you run this code, you will get this. Then what are you doing? If you scroll down in this code, I take the rows and columns, and again I write a regplot, where x equals sf_columns[i], so this will start from the first column, y equals SalePrice, and data is housing. If I run this I should get it. Can you see? So basically what is happening here is that you are again doing a regplot. This is a regplot only; the only difference here is that you say sf_columns[i]. So what will happen? It takes sf_columns[i], and it goes to [0, 0], [0, 1], [0, 2], et cetera, with y as SalePrice and data as housing. So you can plot multiple graphs. Here you can see, if you look at this second graph, there is this BsmtFinSF2, and the graph is very flat. The fitted line is very horizontal. This means the BsmtFinSF2 square feet has no impact on the sale price. But if you look at the first graph, you can see that it's a very steep line, so that means there is an impact there. So you can also plot subgraphs like this if you want; this was just to show you a simple example. So you are plotting the basement area against the sale price, and so on for each of these. The y-axis is always sale price, right? It is a bivariate analysis. What do you mean by bivariate? There are two variables. What are the variables? One variable is the sale price. What is the other variable? There are nine other variables: square feet of the basement, square feet of this, and so on. So for each graph the x-axis is changing. Can you see, this x-axis
is BsmtFinSF1, this one is BsmtFinSF2, this is BsmtUnfSF, but the y-axis is the same, sale price. So each square-feet column you're plotting against the same y. Imagine you have a house with 10 rooms, and for each room's square feet you are plotting a graph. So you have to pass the y-axis, right? So basically, what will happen here? If you run this for loop, it will generate i, and then [0, 0], [0, 1], and it is calling this basement column, whatever column you have, and plotting it on x. You need something on the x-axis, right? So I'm calling each column name, because sf_columns is a list. If I just print sf_columns, let me just print it, it is just a list. From this I'm calling the individual values one by one. So this is the x-axis, right? What is the x-axis? It comes from sf_columns: one, two, three, four, each one. Did you understand this? Give me one moment; let me go up. Did you understand this line? No? So this is how we created sf_columns. Probably you missed it. If you go up, my original data is called housing, and if you look at the columns of housing, I have around 80 columns, right? And among them there are nine columns which have square feet. So we want those columns. This is one such column: it is called BsmtFinSF1; wherever you have SF, it is square feet. Then there is total basement square feet. So here what I'm doing is writing a list comprehension. I'm saying that I want to filter all the columns which match "SF" and store them in something called sf_columns. And if I simply run this, you can see the columns. These are the columns. So how many columns are there? Nine columns are
there. From that list I'm supplying the x-axis; that's what is happening here. You are getting the columns one by one. OK, so there are more visualizations, but we will not go through everything. You just need to have a basic idea. Oh, you didn't see this thing, box plotting. I thought I had removed it. There is a box plot here already. But there'll be one problem. So what are you doing here? You're saying boxplot, and you're plotting Exterior1st and SalePrice. But again the same problem will happen here, because if I run the box plot and you look at the x-axis, you are not able to read it, right? So you need to add that line. Just get it from here, where we used it; we used this rotation one, so I just need to add this. Let me scroll down. So now if I add it here and then run it... sorry, I should set it first, right? Let me just keep it here. OK, got it. So why? You need to assign it first. Why do you need to assign it? Because the original plot has to be stored. So we're saving it as plot, and then you're saying that you want to change the labels on it. So this is how the box plot looks. And what are we plotting here? You are plotting the material used for the exterior coating against the sale price. So what inference can you draw from this? What do you infer from this plot? Make some inferences from this. Outliers, fine; apart from outliers, what does this tell you about the houses? If somebody is asking how cost is involved in this, for example if somebody is using vinyl, vinyl is here, right? Then what do you have here? This is metal, right? Metal is cheaper, because here you have the different coatings and there you have the sale price, so you can see that vinyl is actually
very high. But also look at the length of this one; that also matters. What is this thing, plywood or something? No, cement. This one is very big. That means a lot of houses are actually in this; that also matters. And these dots: you see there are outliers in this box plot. These dots, you see them, right? Those are the outliers; that is how they are represented. Normally you see them as a dot. So probably, in this plot, most of the values are in this range here, but there may be outliers here; that is what is represented. And if you look at something like this one, this is very costly. I don't know what it is, but it is again very costly, because the median value is very high. So you always look at this median value and see how the plot is actually going. That's a box plot. This line is the standard line extending from the box. If you look at some of them, the line runs up towards the outliers; it just says that this is pointing towards the outliers. In general there is nothing to be done about that. The size actually matters. For example, this data is very big, right? If you look at this line, you can see that the total length is big, and this one is very small, because the amount of data you have there is very small. No, wait, I don't think so. I don't think the length actually matters. The median line is what actually matters, where it is. I don't think the length of the line matters, but as a general consideration, if the amount of data you have is more, then the length will be more. The line just shows the other values that fall in that range. For example, for plywood, this many values are there. But if this box is bigger, more values are in it. The line is just a sort of visualization; the actual values are mostly here, because, look at this one, this is very small, right, and this is bigger. So a box plot actually has two
lines, actually. No, no, you should not consider this. You should consider the box, not these lines, and also, in the middle of the box, you see this line? That is the median value. This is the range. Oh, that's what you were asking. So if you look at the whole graph, this is the minimum and this is the maximum. This is the minimum value, this is the maximum value, this is clearly an outlier, and this is the median. Most of the points are centered over here, it seems. OK, so for visualization, I think we have covered the basics. If anything is pending, that will be discussed in your respective statistics classes, because you should know what the difference is between, say, a box plot and a normal curve, how it looks and what its use is. Apart from that, conditions and all, later, when you get the actual data, you will be adding and discussing them.

OK. What you now need to do is to be able to get the data to solve this problem. So the statistical way of thinking typically says: you formulate the problem, and then you get the data to solve that problem. The machine learning way of looking at things typically says: here is the data; tell me what that data is telling you. Many of my colleagues and I myself have run into this problem when going to interviews and so on, and statisticians say that they are not getting jobs out there. So I go to people who are hiring and ask, "Why don't you hire statisticians?", and we reach an interesting conclusion to this entire discussion. Sometimes it goes this way. The interviewer who is interviewing the statistician for a data scientist's job asks, "Here is my data; what can you say?" And the statistician answers with something like, "What do you want to know?" And the business guy says, "But that's why I want to hire you." And the statistician replies: but if you don't
tell me what you want to know, how do I know what to tell you? And this goes round and round. No one is happy about this interview process. So there's a difference in the way these two communities approach things. My job is not to resolve that, because in the world you will face, you'll see a lot more of this kind of thinking than of that one, because in this world the data is cheap and the question is expensive, and you are paid for asking the right question. In that world the question is cheap and the data is expensive; you're paid for collecting the data. So sometimes you will be in a situation where this is going to be important. For example, let's suppose you're trying to understand who is going to buy my product. You're asking the question. Let's say that my products aren't selling and you want to find out why. What will you do? What data will you get? So let's say that you're selling, oh, I don't know, what do you want to sell? Watches. Let's suppose people aren't buying watches anymore, which is a reality, correct? So you're a watch company. Who buys watches? This entire business model of watches is disappearing. Do you have watches? Some of you have. It's actually a surprising number. Maybe they do different things these days. That one is a fitness device; it's not really a watch at all. Something like this: I was with my daughter at lunch today, and she got something like this. My wife is an entrepreneur who owns her own company. She came back from Delhi with two of these; I don't know where she picked them up. The first thing my daughter did was take one of these and pull this part out, an idea that hadn't occurred to me. But that's a separate thing. So a watch is a different thing. But let's say that you're a watch company. Nobody is buying your watches, or fewer people are buying your watches. Now,
how are you going to solve this problem? How are you going to process this information? What do you want to do? What do you want to know? Good morning. Remember, I'm asking this question also from an analytical perspective. So when you say that, to check the model and see what is not selling, that's the whole data question. So, first of all, you see sales: for whom, and when, and how? How do you structure your data? How will you arrange the problem? Okay, that makes the problem even harder, because now you're going to look for data that isn't with you. You know what he's saying? He's saying maybe people are not buying watches because they buy something else. That's a reasonable thing. But let's keep this problem simple. Let's consider only data that is within. You will go outside, no doubt. But let's say that I'm looking at my data. What data do I want to see, and what questions do I want to ask? So: sales, year by year, by type. And then what comparisons do I want to do? Region-wise, age-wise? With what purpose? What question am I asking the data? What section of customers are buying my products? Compared to what? Who are my biggest set of customers? So that's when they ask, who are my biggest customers? That's a very interesting question to ask, except that that question implies that I need to know who my biggest set of customers could have been. But it's a good point: where is the bulk of my sales coming from? Then someone else said something about time, you know: is it going down? So you can look at things like: for which group of customers are my sales going down the most? For example, you could ask that. I'm not saying that's the right question, but that's a possible question to ask. So suppose you follow that approach. What I'm trying to understand is this: I know that my sales are going down. That's an obvious thing. My CEO is telling me, my CFO is telling me: if I don't stop this, we are going to be out of a job. Good. The HMT factories in
Bangalore are not in good shape. One of them, I think, has become the income tax office, somewhere in that area. So that's going to happen to me if I don't do this. So I know my sales are going down, but I don't know by how much. And it could be, for example, that there are certain segments for which the sales are going down. In which segments are sales going down the most? In which segments are they going down a little bit? How fast are they going down? I can ask questions of that sort. Now, what conclusions, at the end of this, do I want to be able to draw? How do I want to use this information? You know, for this you usually follow something like a three-step process. You may have seen this already, and these words should be familiar to you to some extent. The first is called descriptive, the second is called predictive and the third is called prescriptive. Have these words been introduced to you, at least in this context? I'm sure you all cruise the web and look at blogs and things like that. Nothing new in this, I'm sure. But I just want to set that context, because we are going to talk a little bit about what good descriptive analytics is. So: descriptive, predictive and prescriptive. What is a descriptive problem? The descriptive problem is a problem that says: describe for me where I'm losing my sales and when I'm losing my sales. It just describes the problem for me. It tells me where the problem is. It isolates it well. The predictive problem says: look at this data and give me an idea as to what might happen, or what would happen if I changed this, that or the other. So let's suppose I do the following kind of thing. I say, let me relate my sales to my prices. I will be trying to understand: if I reduce the prices of my watches, will more people buy them? Or, conversely, what if I made my watches luxury items: increase the price of a watch, remove the low-end brand, and make a watch an aspirational thing, a decorative item, a luxury item, a brand
item, so that people have a watch not to see the time but as a prestige statement, as a fashion statement, whatever it is? If I do this, what will happen? That's predictive. I'm trying to predict something based on it. I'm trying to see, if something happens to, let's say, one part of my data, what will happen to the other part of my data. And then, based on that, the doctor carries out the predictive analysis of you: because I see this, I now think you have this issue, you have this thing going on. Let's say I'm diagnosing you as being pre-diabetic. You're not yet diabetic, but you're happily on the way to becoming a diabetic. Now, because of this, I have to issue you a prescription. I now should tell you what to do. So there's data that comes from you, that data is in some way modeled using the domain knowledge that the doctor has, and that model is translated into an action. That action is designed to do something difficult, something actually fairly complicated. The first principle the doctor follows is: one, do no harm, the Hippocratic oath. First, make sure that I don't do any unnecessary harm to the patient. Then let me, shall I say, optimize his or her welfare by making sure that I control the blood sugar, or, the best thing, that I postpone the onset of diabetes as best as I can. It's a complex optimization problem of some sort. In business also it's a complex optimization problem. I need to be able to sell more watches, but I also need to be able to make money doing so. I can increase my sales, but if I increase my sales and my profits go down, or my earnings go down based on the cost, that's a problem. But at the same time, if I try to run a profitable business and nobody buys my product, that also is not a particularly good idea. Then there are other issues. When running the company, I've got employees that I want to keep on board. How do I run the company in such a way that it retains that particular workforce? I have finances to
take care of; I have loans to repay. How do I get the cash flow in order to repay the bank loans that I have? So the prescription has to meet lots and lots of requirements. If you're building an autonomous vehicle, you have a situation saying the car has to do this, but it also has to follow certain other rules. For example, if it sees someone crossing the road, it should stop, but it shouldn't stop very suddenly. If it stops really suddenly, it's going to hurt the car, and it's also probably going to hurt the driver. So it needs to stop, but it shouldn't stop too suddenly. It has to follow the rules of the road, because otherwise the computer will simply say, "Oh, you want me to avoid the person crossing the road? I'm just going to go around, anywhere." And you would tell the car, "Please don't do that, because there's a house next to it. You can't just drive through it." "Oh, you didn't tell me that. You just told me to avoid the person; you didn't tell me about the house." Okay, we'll put that as a constraint in our program and see how well it goes. So prescription is problematic. Now, a simple way of looking at it might be to say that description is: how many centuries has Virat Kohli scored? Look up Cricinfo; it will give you the answer. Prediction might be trying to guess how many centuries Virat Kohli will score in the World Cup. Prescription might be: how do we get Virat Kohli to score more centuries in the World Cup? And as you can figure out, you're going from a purely data-based version of the problem to something that's not only about the data. Data will help you, but there's a lot more than the data when it gets to that. What we'll do today, what we'll do now once I've finished talking to you, is we'll take a look at the descriptive part of analytics. So the descriptive part of analytics is talking about simply describing the data without necessarily trying to build any prediction or any models into it. I'm simply telling you the way it is. This is hard. This
is in itself not necessarily an easy thing to do, because you need to know very well how to do that, and what the ways are in which one looks at data. This is a skill in itself. So, for example, let's suppose that you go to a doctor, and the doctor is looking at you, looking at your symptoms, and the doctor recommends a blood test. Now, how does the doctor know what blood test to recommend? Based on the symptoms. But remember that potentially there's an enormous amount of information in you. All of us, as biological things, carry an enormous amount of information, you know, in our blood, in our neurons, in our genes, or whatever. You're talking about big data? As I said, there are two meters inside every cell, and there are a few billion neurons in your head. You don't need to go far to see big data. You are it. You are one walking example of big data; we all are. In that big data, what data does the doctor need to see? That's a descriptive analytics problem. The doctor is not doing any inference on it. The doctor is not building a conclusion; the doctor is not building any AI system on it. But it's still a hard problem, because given the vast amount of data that the doctor could potentially see, the doctor needs to know that this is interesting to me, and this is interesting to me, and this is interesting to me in this particular way. For example, a blood test. Let's suppose I draw blood from you for a particular purpose, let's say for blood sugar. Has any of you studied the biology of how much blood there is to draw? Is any one of you a biologist? I guess not. And if you are not, then there is no problem: I can say whatever I want and you won't understand what I'm saying. Anyway, this is a real problem for me. So you have a large amount of blood that's flowing through you. We all do. This blood carries nutrients. What that does is that every time there is a
nutrient inflow, the blood looks a little different. So if you eat, your blood looks a little different, because that's your blood's job. The blood's job is to carry nutrients. If you want to run, if you want to walk, if I'm walking around, my legs are getting energy from somewhere. The energy needed by my legs is being carried by the blood, and it is being generated through inputs that I get: some of it from the air that I breathe, from where it gets the oxygen to burn things, and some of it from the food that I've eaten, the nice lunch that I had, from which it gets the calories to do that. So, therefore, based on what my energy requirements are and based on what I've eaten, my blood is not constant. My blood content is what is known as a random variable. What's random about it? It looks different all the time. Your blood at 12 o'clock is going to look a little different from how it looks at midnight, and it will look really different from two o'clock in the afternoon, because it's doing something a little different. The same phenomenon is everywhere. If I were to, for example, measure the temperature of the oil in your car or in your two-wheeler, what do you think that temperature will be? It depends. First of all, it depends on whether the car is running or not; it depends on whether it has run or not; it depends on how much oil there is; it depends on how you drive; it depends on the temperature of the car. The answer is: it depends. And the same is true for your body fluids. So this becomes a problem, because if it is random, then from a random quantity how do I conclude what your blood sugar is? How does the doctor reach a conclusion of any sort? An average? An average of what? An average over a particular duration. So there are multiple averages that you can get. First of all, there's a question of saying: if I take blood from you, how is the blood usually collected? The phlebotomist comes and usually takes an injection from one point. Let's say, by some strange accident (this is a hypothetical possibility), by some
strange accident, two different people are drawing blood from your two hands at the same time. Do not try this at home. Suppose they do this. Will we get the same blood? Yes? What do you think? Look, at the same time, and as I said, do not do this at home. But at the same time you're getting two different samples. It's not just a question of time. Your blood is not going to look the same even within your body at one point in time. One from the left hand, one from the right hand, at exactly the same time: it's not going to look the same. There is a slight problem there. Some of you said that your heart is in the middle. Your heart actually is in the middle, but it beats to the left. Why? Because the heart is both a pump and a suction device. The pump side is on the left and the suction side is on the right, so your blood flows out from your left side and goes back in on the right side. So there is an asymmetry in your body between left and right: one side is for going out, the other side is for coming in. It's slight; it all mixes up in the middle. So one sampling idea is that I'm taking a sample of blood from you, and it's just one sample. The second question is, as you were saying, a question of time. You can average over time. This is a little easier. You can say, I'm going to do this maybe before eating, after eating, well after eating. So those of you who have had blood pressure tests, for example, sorry, blood sugar tests: once they ask you to do it fasting, and then they ask you to do it some two hours after eating. Do they tell you what to eat? Sometimes glucose. Sometimes they don't; they sort of say, based on what you naturally eat, let me figure out how you are processing it. They expect you to eat a typical meal and not go and, you know, eat large amounts of KFC if that is not what you normally eat. Just eat what you normally eat. If you're vegetarian, eat what you'd normally eat, normal food, and then figure it out. Let's see how your body is handling it: do a normal thing, and I'll take another normal
something. Then one of you said something very interesting: average things out. Now, what does averaging do? Neutralise? That's an interesting word to use. It neutralizes things, all right. Provide context? Context of what? Context is a good point. So what is the doctor trying to do? Let's simplify things a little bit and say that, let's suppose, the doctor has a threshold. Let's give it a number. Let's say the doctor says that if your blood sugar is above 140, we'll do something; if your blood sugar is less than 140, we won't do anything. I don't know whether it is the right number or not, but let's make it up. Now, the doctor is going to see from you a number. It may be a single reading, it may be an average, it may be a number of things. How is the doctor going to translate what they see from you and compare it to the 140? How is that comparison going to be made? So let's suppose I have just one reading, and that reading, oh, I don't know, is 135. I've just got one reading from you, 135. What does that tell you? Nothing is required? One argument is: it's simple. Let's take a very machine learning, computer science view of this: 135 is less than 140. Ah, so now we say, yeah, but you know what, let's say there's this 135 and another guy who is at, say, 120. There should be something that says that this 135 is a little bit more trouble than 120; it's closer to the threshold. So maybe this threshold isn't quite as simple as I thought it was. I can solve this problem in one of two ways. One way to do this is to make this 140 a little range. This sometimes is called fuzzy logic: the question you're asking becomes fuzzy, not as crisp. You're not fiddling with the data; you're fiddling with the boundary, you're fiddling with the standard. The other way to do that is to create a little uncertainty, a plus/minus, around the reading itself, around 135, saying that if this is 135, and let's suppose that I go and
get another reading, and the second reading that I get is, say, 130, and the third reading that I get, on the day after that, is 132, and I say, okay, it seems to be fine. But let's suppose that after the 135 that I got, I do the usual thing and measure again, and this time it comes out as 157, and I do it again and it comes out as 128, and I do it again and it comes out to be 152. So in both cases 135 was probably a good number, but in one case the 135 was varying very little and in the other case the 135 was varying a lot, which gives me different ideas as to how to process it. So what descriptive analytics talks about, essentially, is trying to understand certain things about data that help me get to conclusions of this kind a little more rigorously. Now, to be able to quantify what these plus/minuses are is going to take us a little bit of time, and we will not get to that in this residency; we'll get to it in the next residency. In order to say, instead of 135, 135 plus or minus something, that question now needs to be answered. But to do that I need to have two particular instruments at my disposal. One instrument that I need to have at my disposal is to be able to know what to measure. I need to say what it means. I need a statement that says that I'm, maybe, 95% confident that something is happening: I'm 95% sure that this is below 140. I need a way to express it, and that is the language of probability. So what we will do tomorrow is introduce a little bit of the language of probability. It will be a little unrelated to what we're doing today, so there's going to be a little bit of a disconnect. But what we're going to do is create two sets of instruments: one set of instruments that is fairly descriptive in nature, and one set of instruments that is purely mathematical in nature, so that I can put a mathematical statement on top of a description. And the reason I need to do that is that a pure description is not helping me solve the problem that I have set.
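To make the two sets of readings concrete, here is a minimal sketch in Python. It uses the readings and the made-up cutoff of 140 quoted in the lecture; the use of the sample standard deviation as the "plus/minus" is an assumption for illustration only, since the lecture defers the proper quantification to the probability session.

```python
import statistics

# Readings quoted in the lecture: both series start from the same 135.
steady = [135, 130, 132]          # varies very little
erratic = [135, 157, 128, 152]    # varies a lot
THRESHOLD = 140                   # the made-up cutoff from the lecture


def summarize(readings):
    """Return (mean, spread) for a list of readings.

    The spread here is the sample standard deviation; that choice is an
    assumption for illustration, not the lecturer's stated method.
    """
    return statistics.mean(readings), statistics.stdev(readings)


steady_mean, steady_spread = summarize(steady)
erratic_mean, erratic_spread = summarize(erratic)

for name, mean, spread in [("steady", steady_mean, steady_spread),
                           ("erratic", erratic_mean, erratic_spread)]:
    side = "below" if mean < THRESHOLD else "at or above"
    print(f"{name}: {mean:.1f} +/- {spread:.1f} (mean is {side} {THRESHOLD})")
```

The same first reading of 135 ends up with a small plus/minus in one series and a large one in the other, which is the reason a single number is hard to compare against a single threshold.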
So, therefore, what will happen is that in certain medical tests you will not see points like this; you will see intervals. Your number should be between this and this. Your cholesterol number, your HDL, whatever, should be between this and this. You won't see a number; you'll see a range, which typifies the variation. And in certain cases you will see thresholds, maybe just a lower limit or an upper limit, but you will also see a recommendation that says, please do this again. You know, what am I going to compare? I can't compare one number to one number. One number against one number is typically a very bad place for any kind of analyst to be in, because you've got no idea which is error-prone and where the error is. So, therefore, what happens is you try to improve one of those numbers, either by fiddling around with the range or by getting more measurements. You'll do that, and you'll see that as we go along a little bit. So this is the context for what we have in terms of data. Let's see. So this is a set of files that has been loaded. It's a very standard set of files. It's not mine, to be honest; I just want to make sure that I'm doing what I'm supposed to be doing. So, for reasons that are more to do with security, my understanding is that the notebook will not access your drives, so keep it on your desktop and don't complicate life. And there is this notebook. It's called Cardio Good Fitness, descriptive statistics. "Statistics" refers to the idea that this comes from the statistical way of thinking, which, as I said, as opposed to the machine learning way of thinking, tends to be a little more problem first, data next, which means we worry about things like hypotheses and populations and sampling and questions like that. And the "descriptive" part refers to the fact that it is not doing any inference. It is not predicting anything. It's not prescribing anything. It is simply telling you what is there, with respect to certain questions that you might possibly ask of it. What is the context
of the case? The market research team at the company is assigned the task of identifying the profile of the typical customer for each treadmill product offered by the company. The market research team decides to investigate whether there are differences across the product lines with respect to customer characteristics. Exactly what you guys were suggesting that I should do with respect to the watch: understand who does what. Entirely logical. The team decides to collect data on individuals who purchased a treadmill at a particular store during the past three months. Like the watches. They collect data on these purchases, and that is in the file, in the CSV file. So what you should have is a CSV file in the same directory, and through the magic of Python you don't have to worry about the exact path. Before we get there, remember: because we're looking at this statistically, before we get the data we should have a rough idea as to what we're trying to do. And so they say that here are the kinds of data that we're looking at: the kind of product, the gender, the age in years, education in years, relationship status, annual household income, the average number of times the customer plans to use the treadmill each week, the average number of miles the customer expects to run or walk each week, and self-rated fitness on a scale of one to five, where one is poor shape and five is excellent shape. Some of this is data. Some of this is opinion. Some of this is opinion masquerading as data, like, for example, the number of times the customer plans to use the treadmill. Possibly wishful thinking. It's still data. You're asking someone, how many times will you use it? "No problem, seven times a week." Well, we'll see. But still, it's come from somewhere. So what has happened? The way to think about this is to say that I want to understand a certain something, and that certain something has to do with the characteristics of the customer, customer characteristics. And to do this, you can
either take, let's say, a marketing point of view (who buys it?), or you could also take a product engineering point of view (what kind of product should I make?). In business, as you probably know... Are any of you entrepreneurs? One hand up. One hand, and a few closet entrepreneurs, from what I can figure out. Sometimes it's unclear what that word means. You think you are confident enough to call yourself one? Oh, you're doing that in the IT space. If you're an entrepreneur, for example, in the physical product space or even in the software space, one of the things you often think about is what's called product-market fit, which is: you're making something; how do you match between what you can make and what people will buy? Because if you make something that people do not buy, that doesn't make any sense. On the other hand, if you identify what people buy and you can't make it, that also doesn't make much sense. So the conclusion that we will draw from this, we will not draw today. But the purpose is to be able to go towards conclusions of that kind: isolate products, isolate customers, and try to figure out what they tell us. Pandas generally has a fair amount of statistics built into it; that's what it was originally built for. NumPy is something that was built more for mathematical problems than anything else, so some of the mathematical algorithms that I need are there. There are others for statistics and for plots, Matplotlib or Seaborn, and many other things that you've seen already. Python is still figuring out how to arrange these libraries well enough; shall we say that the programming bias sometimes shows through in the libraries. So I, for one, do not remotely know this well enough to know what to import up front, but a good programmer will know what to import up front, and you do all this up front so you don't get stuck with what you want to do. The naming is up to you. If you like the names as they are, then that's fine; you can use something else instead of these names. So when
you don't have a path, if this is in the right path, just this will work: read CSV. It's usually smart enough with Excel-style files saved as CSV, if you have your data as Excel files and things like that. But if it isn't, then just go and save the Excel file as a CSV file and operate that way, in case it doesn't do it on its own. Then, when Jupyter sees it, it is seeing the Excel file as a CSV file. Or go in and make the change yourself. Or you can use other read statements as well. You can change the arguments inside it, and you can figure out how much to show with head. What this shows us is the head and the tail of the data. This is simply to give you a visualization of what the data is. It gives a sense of what variables are available and what kinds of variables they are. We'll see a little bit of a summary after this, et cetera. So, for example, some of these are numbers. Income: what is income? Income is annual household income. That's a number. Some, for example gender, male or female, are categorical variables. This is not entered as a number; it's entered as a text field. If you are in Excel, for example, right at the top, if you go in and filter, it will tell you how many distinct entries there are. So usually what happens, right in the beginning, when a data frame like this is created, is that the software knows whether it is talking about a number or whether it is talking about categories. There are certain challenges to that. You can see one particular challenge here. What does this 180 mean? Counts. Why do you think there are so many decimal places that come here? 14 years of experience, 16 years of experience: why is it showing all those zeros? Yes, it does this because it sees other numbers where those decimal places are needed. So what it does is what any software typically does: when it
sees data, it sort of asks: at what granularity do I need to store the data? Sometimes this is given by the computer, your 64-bit or 32-bit and things like that. What it means is that the data is stored in the data frame to a certain number of digits, and usually you will see it in this way. Sometimes, for example, when you say include equal to "all" and you ask for a full description, the data comes out looking slightly irritating because of something here, let's say the income field or any of that. Now, when it looks at the description of this, what is the description that it is reporting, and how does it choose to report the description in this particular situation? Let's take a bit of a closer look at this. One thing here: look at the way it structures it. Count, unique, top, frequency, and then certain things here: mean, standard deviation, minimum, 25%, 50%, 75% and max. When it sees a variable like gender, it reports lots and lots of NaNs. What it is effectively saying, right off the bat, is: I can't do that. NaN means "not a number". In other words, if you ask me to find the mean of something and you're giving me male and female as inputs, I don't know what to do, which is an entirely reasonable stand to take for any reasonable algorithm. It requires another kind of description for it to work, but the problem with the describe syntax is that it is asking for the same description for all of them. Whether it's the significant digits or the columns, et cetera, it shows them in this description. It says that's all that I'm going to give you. But where it makes sense, let's say, for example, I look at age: for age I have got 180 observations and it is calculating certain descriptions for it. Correct? So let's look at these. It is calculating descriptions like, say, the minimum. The minimum is what? 18. The maximum is 50. These are easy to understand. But here is something a little interesting.
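The behaviour described above can be reproduced with a small sketch. The five rows and the column names `Age`, `Gender` and `Income` below are made up for illustration and are not the lecture's file; an in-memory CSV stands in for the real one so that the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the course CSV: made-up rows, assumed column names.
csv_text = """Age,Gender,Income
18,Male,29562
24,Female,31836
26,Male,50596
33,Female,52302
50,Male,104581
"""

# read_csv accepts a path or a file-like object; a real file in the working
# directory would be read the same way by passing its name.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))   # first rows
print(df.tail(2))   # last rows

# include="all" asks for one summary table covering every column.
summary = df.describe(include="all")
print(summary)

# Numeric-only statistics are NaN for the text column, and
# category-only statistics are NaN for the numeric columns.
print(summary.loc["mean", "Gender"])    # NaN: a mean of Male/Female is undefined
print(summary.loc["unique", "Age"])     # NaN: "unique" is reported for text columns only
```

The NaN entries are pandas declining to compute a statistic that does not apply to that kind of column, which is the "I can't do that" described in the lecture.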
Suppose I want to report one representative age for this data set. This is like asking the question: how do I get a representative blood sugar number for you? I can give you a minimum and a maximum, but to get the minimum and the maximum I need to draw blood many, many times from you. But let's suppose I want one representative value for you. Somebody asks you, what is your blood sugar? You want to give them one number. Similarly, somebody is looking at this data and asks the question: give me a representative age. How old is your typical user? Or: what age do you want to build it for? Or you're even asking a product question. You're a product designer, and a product designer is building a treadmill. Now, how do you design a product, those of you who are engineers? Based on the... very good, weight. Whose weight? The user's. What is the weight of the user? That's a good point: as a design engineer, I need to know what weight will be on that treadmill. So there's the question of saying: if I want to measure a variable by one number, how would I even frame that question? What makes sense? What is the one number? The average? The max? There are possibilities. You might argue the max is the right number, because I want to be able to say: if I can support him, I can support anyone. But there's also a downside to that. I have now engineered that product; you could argue that I have, shall I say, over-engineered that product. Okay? So let's suppose that you are doing this for a mattress. We're all relatively wealthy, based on the fact that we're here, so we probably sleep on a mattress; not everyone is fortunate enough to sleep on a mattress. But let's suppose you do sleep on a mattress. How much weight should that mattress be designed to bear? If you over-engineer it, what will happen is that for a reasonable weight, let's say a weight a lot below that, the mattress is not going to sink. Let's say that you design for 100 kilos. Now if you're 50
kilos or 60 kilos, that mattress is not going to sink for you. It is going to feel comfortable for someone who is 100 kilos, and someone who's 50 kilos is just going to bounce on it; you're not going to feel its soft silkiness, or whatever it is you want to feel from the mattress. It won't work. So what to do? That is the description question, and it's a hard problem: who do I engineer for? And so, therefore, people have different ways of choosing what to represent it with. So here's one version of it. This is what is called a five-point summary. I report the minimum, the 25% point, the 50% point, the 75% point and the maximum. Variable by variable, I report five numbers. I report the lowest. What does 25% mean? It means 25% of my data set, of the people, are younger than 24. The youngest is 18. 25%, or a quarter of them, are between 18 and 24, a quarter are between 24 and 26, a quarter are between 26 and 33, and a quarter are between 33 and 50. This is what is known as a distribution. Statisticians love distributions. They capture the variability in the data, and they do all kinds of things with it. So here is a typical shape of a distribution; we will make more sense of it later on. This is a theoretical distribution. A distribution, for example, has a minimum, has a maximum, has, say, a 25% point, a 50% point and a 75% point. In terms of probabilities, there is 25% here, 25% here, 25% here and 25% here. If you want to think purely in terms of the data, a probability is just a proportion. If you want to think in terms of probabilities, what this means is that out of 180 people, if I draw one person at random, there's a 25% chance that that person's age is going to be below 24. Thinking in probabilities, we'll do that tomorrow. But this is the description. So what this description does is give you an idea as to what value to use in which situation. So, for example, you could say that I am going to use 26 as my representative age. If I do that, what is the logic I am using?
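As a sketch of the five-point summary, the snippet below computes the five numbers with NumPy. The nine ages are made up, chosen only so that the result matches the figures quoted in the lecture (18, 24, 26, 33, 50); they are not the real 180 observations.

```python
import numpy as np

# Made-up ages (not the lecture's data), chosen to reproduce the quoted summary.
ages = np.array([18, 22, 24, 25, 26, 30, 33, 40, 50])

# Minimum, 25% point, 50% point (median), 75% point, maximum.
labels = ["min", "25%", "50%", "75%", "max"]
values = np.percentile(ages, [0, 25, 50, 75, 100])

five_point = dict(zip(labels, (float(v) for v in values)))
print(five_point)
```

Each consecutive pair of numbers brackets roughly a quarter of the observations, which is what makes the five numbers a compact picture of the distribution.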
This 50% point, so to speak, is called the median. The median means the age of the average person: first sort, take the middle person, and ask, how old are you? That is the age of the average person. I could also ask for the average age of a person, which is the mean, which is 1/n × (x1 + x2 + ... + xn). Now, this is algebra. What you have to do is put n equal to 180: this is the first age, the second age, the third age, up to the 180th, so it is (age 1 + ... + age 180) divided by 180. This is called the mean. This value is what? 28.79. The average age is about 28 years, or 28 and a half years, 28.8 years. But the age of the average person is 26. Yes, there is a difference between the two. So I described the median as the age of the average person, and I described the mean as the average age of a person, and he's looking at me as if to say, you have to be kidding me. That's confusing, I admit. The easy way to understand it could be this. What is the mean? Add them all up and divide by how many there are. What is the median? Sort them from the smallest to the largest and pick out the middle one. If there is an even number, what do you do? You take the average of the two middle ones. If they are the same, it will be the same number; if they're not, it'll be a number between them. So sometimes the median may show up with a .5 or something like that, for that reason, if the values are integer counts but there is an even number of them. Now, which do you think is better? The answer you're giving is "it depends". You've figured out that I like that answer. When would each make sense? It depends on what context you're going to use it for. In certain cases... Yes, that's spot on, if you're talking in terms of parameters. So he used an interesting term; he's asking, what is the parameter I am after? Parameter is an interesting word. A parameter refers to something in a population; it is an unknown thing that I'm trying to get at. For example, blood sugar is a parameter: it exists, but I don't know it, and I'm trying
I am trying to get a handle on it, correct? If you are thinking in terms of parameters, these are different parameters. So let's look at a distribution here. The median is a parameter such that on this side I have 50% and on this side I have 50%. This is the median. The mean is what is called the first moment. What that means is: think of this as a sheet of metal that I want to balance. Where do I put my finger so that it balances? It is the CG of the data, the centre of gravity of the data.

You can understand the difference between these two now. If, for example, I push the data out to the right, what happens to the median? Nothing happens to the median, because the 50-50 split remains the same. But if I push the data out to the right, the mean will change: it will move to the right. It is the lever principle, right? If there is more weight on one side, I have to move my finger in order to counterbalance that weight. So these are two different parameters.

If the distribution, for example, is what is called symmetric (symmetric means it looks the same on the left as on the right), then these two will be equal, because the idea of going half to the left and half to the right will be the same as the idea of where I balance, because the left is equal to the right. So when the mean is not equal to the median, that's a signal that the left is not equal to the right. And when the mean is a little more than the median, it says that there is some data that has been pushed to the right. That should be something you can guess here, because the median is at what, 26? The lowest is 18; that's about eight years less than that. But what is the maximum? 50. That's about 25 years beyond. The data is pushed to the right. The technical term is right skew. There are, shall I say, more people who are not average on the older side than on the younger side. There was a hand
up somewhere. So one reason that the median often doesn't move is that it is not that sensitive to outliers. Suppose, for example, we look at ourselves and ask what our mean income or median income is. Each of us makes a certain amount of money, and we can sort that. Now suppose that Mr Mukesh Ambani walks into the room. What is going to happen to these numbers? The mean is going to go up: he alone, his family, makes a very large multiple of all our incomes put together. Possibly; I don't know how much you make. But what's going to happen to the median? It is going to stay almost the same. The typical person may move by at most half a place. The typical person is going to be an actual individual in the room, or maybe an average of two individuals in the room, and that person is not going to change.

That's one conclusion we can draw from this. Are there other plots below which will also show the same thing? You have a good logical reason. I haven't shown you the full data; we'll see the histogram and we'll do that, so hold on to that question. The conclusion that was drawn is that there are two pieces. See here: if I simply look at this without seeing any more graphics, where is the middle of the data from the median perspective? At 26. Now from 26, look at the difference between 26 and the smallest, 18. Between 18 and 26, that's eight years. These eight years contain 90 observations, because there are 180 in total. What is on the opposite side of this? 26 to 50. That's how many years? 24 years. These 24 years contain how many observations? The same, 90. So there are 90 observations between 18 and 26, and there are 90 observations between 26 and 50. So if I want to draw a picture, what would that picture look like? Yes, exactly as you're drawing it. This is a problem people have with "right skew" as a word, but by definition this is called right skew.
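A small sketch of the point about outliers, using Python's standard `statistics` module. The incomes are invented numbers in arbitrary units, and the single very large value stands in for the very high earner walking into the room.

```python
from statistics import mean, median

# Hypothetical incomes of the people already in the room (arbitrary units).
incomes = [30, 35, 40, 45, 50, 55, 60]

# One extremely high earner walks in.
with_outlier = incomes + [1_000_000]

print("before:", mean(incomes), median(incomes))
print("after: ", mean(with_outlier), median(with_outlier))

# The mean is dragged far to the right; the median moves by half a place.
shift_in_mean = mean(with_outlier) - mean(incomes)
shift_in_median = median(with_outlier) - median(incomes)
print(shift_in_mean, shift_in_median)
```

The median moves from one person's income to the average of two neighbouring incomes, while the mean no longer describes anyone in the room.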
More data to the right? "More data" is a dangerous phrase: note that it's the same number of observations. I'd say the data is pushed to the right, that there is more variation on the right side; that is probably a safer way of putting it. Yes, so skewness is often measured in various ways. One measure of skewness is, for example, mean minus median. If it is positive, it usually corresponds to right skewness; mean minus median negative usually corresponds to left skewness. This is a statistical rule, but sometimes it is used as a definition of skewness. There are many definitions of skewness. Skewed data sometimes causes difficulties in analysis, because the idea of variation changes: variation on one side means something very different from variation on the other side.

By the way, what's happening with respect to things like books? Are you getting books or not getting books? I have no idea what the books are. You got one book, which is the statistics book? Okay, I'll take a look at that book later. One comment: it is not a Python book. Does that make it a bad book? No. If you are looking for help on how to code things up, this is not the right book; get a book like Think Stats or something like that. If you want to understand the statistics side, it is an excellent book. Everything that I'm talking about is going to be here. I'll talk about which chapters and things like that at some point, and I might talk about how to use the book. For example, at the back of this book there are a lot of tables, which you will learn how to use, and then I'll try to convince you that you shouldn't use them. But remember, many of these methods are used in situations in which either you don't have access to computers or, if you do have access to computers, you don't have them at, shall we say, run time. In other words, I can build a model using a computer, but I can't run it with one. The runtime environment for statistics is often
done when there are no computers around. The build environment can include computers, but the runtime environment cannot. A lot of statistics is done in that kind of situation. Okay, so for definitions of skewness, use the book the way you usually use a book: go to the index and see whether the word is there, then go back and figure it out. It will give you some idea of how that works. It's a nice book, one of the best books you have in business statistics, but it is not necessarily a book that will tell you how to code things up. That is not a deficiency of the book. Not every book can do things of that sort, and there are other books around that will tell you how to code things up but will not explain what you are doing. It's important to know what you are doing. It's also important to know why you are doing it. But books can't be written with everything. My guess is that the thinking is here; I think this is good for thinking. I would absolutely recommend this book on the thinking side. What you won't get is this: it will say "do this", and it won't give you the Python syntax to do it. That will not be here; you will have to solve that problem through some other means.

I used to have a colleague, in my corporate life, who had a very big sticker on his board. It said, "Google search is not research." Nobody agrees with him anymore. So I suppose that when in doubt, you do what normal Homo sapiens do today, which is Google for an answer. So one possibility is that you understand something from books such as this, and if you want the syntax, you Google for the term, say "Python" and that term, whatever it is, and it will probably give you the code. Things are very well organised these days. I should also give you a very slight warning here, not to discourage you from anything: in the next nine months or so, which is about the duration of your programme, there's going to be a fair amount of material thrown at you,
correct? The look and feel will sometimes be like what we often call drinking from a fire hose. You can if you want to, but you get very wet. So pick your battles. If you want to understand the statistics side of it, please go into the depth of it. But if you try to get into equal depth on every topic that you want to learn, that will take up a lot of your professional time. Now, the reason we do the statistics first is that it's a little easier from a computational perspective, although harder from a conceptual perspective. We begin this way, but hold on to that idea, and then as you keep going, see if this is something that you want to do more on. And if you do, you are welcome to write to us: let any one of us know and we will get you the references. But wait for the first residency and see what happens there. Yes, it's a well-written book, and its instructor is one of our colleagues here, someone who can also help explain things.

So this is the summary. What this summary gave you is called the five numbers, five numbers that help you describe the data: minimum, 25%, 50%, 75%, max. We'll see another graphical description of this. It also gave you the mean. There is also another number here, indicated by the letters std. std refers to standard deviation. And what is the formula for a standard deviation? std is equal to the square root of... but take it in steps. Step one: calculate the average. Step two: take the distance from the average for every observation. Ask the question: how far is every data point from the middle? If it is very far from the middle, say that the deviation is more; if it is not far from the middle, say the deviation is less, deviation being used as a synonym for variation. Variation can be more or variation can be less: more than the average, less
than the average. If someone is much older than average, there is variation; if someone is much younger than average, there is variation. So both of these are variation. So what I do is, when I take the difference from the average, I square it: more than x-bar becomes positive, and less than x-bar also becomes positive. Then I add them up and average them. There's a small issue as to why it is n minus one, and that is because I am taking a difference from a quantity, the average, that is itself already taken from the data. Now, I have squared. My original unit was age, in years; when I have squared, this has become age squared. So I take the square root to get my measure back onto the scale of years. So the standard deviation is a measure of how spread out a typical observation is from the average. It is a standard deviation, where a deviation is how far from the average you are. And because of the squaring, you need to work with a square root.

In modern machine learning, people sometimes use something called the mean absolute deviation, MAD, very optimistically named. It says: you don't take a square, you take an absolute value, and then you do not have a square root outside it. That is sometimes used as a measure of how much variability there is. So why do we square? We square because we want to look at both positive and negative deviations. If I didn't square it, what would happen is that it would cancel out. What was the word one of you used? "Neutralise", right: your positive deviations would neutralise your negative deviations. Yes, this number is going to be positive.

So let's look at the first number here. When I did the head command here, what did the head command give me? The first few observations. Now, this is an 18-year-old; this is probably sorted by age. I am trying to explain the variability of this data with respect to this 18-year-old.
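The steps described above (take each distance from the average, square it, average with n minus one, take the square root) can be written out directly, alongside the absolute-value alternative. This is a minimal sketch with made-up ages; the function names are hypothetical, and dividing the mean absolute deviation by n is an assumption (conventions differ on the divisor).

```python
from math import sqrt


def sample_std(values):
    """Standard deviation with the n - 1 divisor."""
    n = len(values)
    avg = sum(values) / n
    # Squaring makes deviations above and below the average both positive.
    squared = [(x - avg) ** 2 for x in values]
    # The square root brings the measure back to the original units (years).
    return sqrt(sum(squared) / (n - 1))


def mean_abs_dev(values):
    """Mean absolute deviation: absolute values, no squares, no square root."""
    n = len(values)
    avg = sum(values) / n
    # Assumption: divide by n here; some presentations use n - 1 instead.
    return sum(abs(x - avg) for x in values) / n


# Hypothetical ages, for illustration only.
ages = [18, 22, 26, 28, 50]
avg = sum(ages) / len(ages)

# Without squaring or absolute values, the deviations cancel out.
raw_sum = sum(x - avg for x in ages)

print(round(raw_sum, 9))
print(sample_std(ages))
print(mean_abs_dev(ages))
```

The single far-away value (50) contributes most of the squared total, which is why the standard deviation comes out larger than the mean absolute deviation here.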
What is the variation? This 18 is not the same as 28; 18 is less than 28. So what I want to do is go 18 minus 28.7. What I'm interested in is this roughly 10-year difference between the two. Now, the oldest person in this data set is how old? 50. When I get to that row, this 50 will also differ from this 28, by about 22 years. So I'm interested in that 10, and I'm interested in the 22. I am not interested in the minus 10 or the minus 22. What I can do is represent 18 minus 28 as 10 and represent 28 minus 50 as 22. And that is this: one over n minus one, times (|x1 - x-bar| + ... + |xn - x-bar|). This is what is called the mean absolute deviation, and many machine learning algorithms use this. You are correct: in today's world this is simpler. When standard deviations first came up, this was actually harder.

People argued about this, I think about 150 years ago, maybe more; I forget my history that much. There were two famous mathematicians, one named Gauss and one named Laplace, who argued as to whether to use this or whether to use this. Laplace said you should use this, the absolute value, and Gauss said you should use this, the square. The reason Gauss won was simple: Gauss's version was easy to do calculations with. Why is it easy to calculate with? Because Newton had come up with calculus a century or so before that. So, for example, suppose you want to minimise variability, which is something that we often need to do in analytics. That means you need to minimise things like the standard deviation, which means you need to differentiate this function. The square function is differentiable; you can minimise it using calculus. The absolute value is not. So what happened was that Gauss could do calculations and Laplace could not. Laplace lost, and Gauss won the definition of the standard deviation.

Someone is asking about using the 25% and the 75% points. Okay, why do we not do that? So today this entire argument
makes no sense, because today how do we minimise anything? With a computer programme. You don't use any calculus; you run fmin or something of that sort. You basically run a programme to do it. So the old argument goes away: you can do calculations equally well with this as with that. So today what is happening is that Laplace's way of thinking is being used more and more. This one is a lot less sensitive to outliers. With the square, if a point is far away, the 22 squares to 484 or something like that, which is a large number. So the standard deviation is often driven by very large deviations: the larger the deviation, the more it blows up. And so it is often heavily criticised. If you read, for example, the finance literature, there is this guy called Taleb, Nassim Taleb, who wrote books called The Black Swan and Fooled by Randomness, where he strongly criticised the standard deviation as a measure of anything. So today this argument doesn't make a great deal of sense, and where something like this makes sense, it is often used too. A lot of this is historical: it looks this way because of a certain historical definition, and then it is hard to change. So today, centuries after Gauss, people like me are trying to explain it and having trouble doing it, because there was a logic to it and that logic doesn't hold any more.

How far, on the average, is an observation from the average? A confusing statement again. But how far, on the average, is an observation from the average? If that is zero, that means everything is at the average. But you are asking the question: how far from the average is an observation, on the average? If I take your blood pressure, how far from your average blood pressure is this reading? If it is exactly equal every time, then I don't need to worry about variability. What is your average bank balance? Don't tell me that, but you know what I mean. You have an average bank
balance. Your bank account manager, or your bank, actually tracks what your average bank balance is. But your actual balance is almost never, or very, very rarely, equal to your average bank balance. It's more and it's less. How much more and how much less is something that the bank is also interested in, to try to figure out how much of your money, so to speak, to lend out. The bank is going to make money by lending it out, but when it lends it out it can't give it to you, so it makes an assessment as to how much money to keep. I won't get into the finance now, but you get the drift. So the standard deviation is a measure of that, but it is not the only measure of that.

For example, here is another measure. Remember this 25% number and the 75% number that you were asking about. Let's look at them carefully. Let's say 33, the 75% point, minus 24, the 25% point: 33 minus 24. This is my 24, this is my 33. Between them, how much of the data lies? 50%. Why? Because this is 25% and this is 25%, so this now contains 50%. This is sometimes called the interquartile range. Big word. Why is it called an interquartile range? The reason is that sometimes this is called Q3 and this is called Q1. Q3 stands for the upper quartile (you can understand "quartile" as a quarter), and this is the lower quartile, and the difference between the upper quartile and the lower quartile is sometimes called the interquartile range. Why is it called a range? Because what is the actual range of the data? The range of the data in this particular case is 50 minus 18, which is your max minus your min. This is sometimes simply called the range. The range is maximum minus minimum; the interquartile range is upper quartile minus lower quartile. These measures are used; they do see certain uses in certain applications, and you can see certain advantages to this. For example, suppose I take my five-point summary. With my five-point summary I can now give
you a measure of location, which is my median, and I can give you two measures of dispersion, which are my interquartile range and my range. So those five numbers have now been twisted to give me a summary number, which is the median, and range numbers. Interestingly, I can also draw mental conclusions from that. For example, I can draw conclusions from these five numbers in the following way: 24, 33. Half my customers are between 24 and 33. So if I want to deal with half my customers, I need to be able to deal with a range of about nine years. Within these nine years is all that I'm interested in. So if I'm building my machine, I am going to make sure, let's say, that the 33-year-old is okay with it and the 24-year-old is okay with it. Will the 50-year-old be okay with it? But if I want to make the 50-year-old okay with it, I may have trouble with the 18-year-old. We can do this even with these five numbers; we will see more descriptive statistics as we go along.

By the way, this is only for age. I can do this for usage, I can do this for fitness, I can do this for income. Income is interesting. Here, the median income is $50,000 and the mean income is about $53,000. If you look at income, in almost all real cases the mean income is going to be more than the median income. The per capita income of India is more than the income of the typical Indian.

What does this command do, if I say mydata.info()? mydata, first of all, is the data frame that I created just a while ago; I read the file this way. This one is describe, and this one here is info. "Describe" and "info", in the English language, are similar things: description, information. In the software they are interpreted as two completely different things. Info is about your variable settings, your field settings; it is giving you information on the data as data. The word "data" means different things to different people. To a statistician, data means what? To a statistician,
data means a number. To an IT professional, what does data mean? Bytes, information. You know: "I have lost my data." So this information tells you about the data as such. It says whether a field is an object or a 64-bit integer; it tells you what is numeric and what is categorical. It tells you about the kind of data that is available: in other words, there are objects in some fields, and so on. There are so many integer types, stored as 64-bit because this computer is probably 64-bit capable, and there are three categorical variables. This is, shall we say, a data-side summary of what is there in the data, not a statistical summary. It is useful in its own way, particularly for processing the data and storing it.

For those of you who are going to go into data curation kinds of careers, this kind of database is a nightmare, because typically, when you store real data, in addition to the data you also store what's called a data dictionary. Sometimes that's referred to as metadata: data about the data. Simply storing a bunch of numbers is not enough; you have to say what the numbers are about. This adds a layer of complexity to the metadata: you now have to store not only what the variable is about but also what kind of variable it is. So what many professional organisations say is that archival data should never be a mixture of numerical and categorical objects, and they pay a price for that: numerical things should become categorical, or categorical things should become numerical. But if you are storing large volumes of data, archiving it, and making it available to people who have not seen it before, it sometimes gets convenient. So things like this are often useful for seeing how big a problem you have.

Now, I would plot a few things. You can plot anything. Seaborn, I think, is coming later. But this plot is from the matplotlib library, and it is plotting through a
command called hist. hist means histogram, which you have already seen. So this is the histogram. The histogram syntax has bin sizes and figure sizes, so what you can do is play around with these and see the differences in what the histogram does. But there is a certain default that shows up, and that default is quite good. And here is a histogram, a distribution, of the age. This is not a set of numbers; this is a picture. What does this picture have? This picture has a set of bins, and it has a set of counts within each bin. Between these two numbers, between this one and whatever this is, let's say 22 or thereabouts, I have a count that says, say, 17. It gives a count, and it does this by working out how many bins there are and plotting this shape.

It's a little bit of work to write a histogram programme. There's a Python book out there, I think Think Stats, in which sort of the first one-third of the book is basically how to write histogram code. It is a wonderful book, but because it uses this example it got terrible reviews. Reviewers said, "Why do I want to learn how to code a histogram programme?", and the book's author said, "I'm not teaching you how to code a histogram programme; I'm using it as an example of how to do that." And I tend to agree. If you want to test your understanding of data and your understanding of any programming language and any visualisation language, code a histogram and have fun. It's a nice challenge from many perspectives: the data challenge, the language challenge, the visualisation challenge.

The question was why they want archival data to have only one data form, only one format. Why is that so? Because, as I said, when you store data, how do you store it? Let's say that you've generated an analysis, the analysis is done, and you've decided not to destroy the data. You are going to keep the data in your company's databases or in your own database. How will you keep it? You can take any technology.
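As a take on the "code a histogram and have fun" challenge mentioned above, here is a minimal sketch in plain Python, without any plotting library. The ages, the function name `histogram_counts` and the choice of four equal-width bins are all assumptions made for illustration; the sketch also assumes the values are not all identical, since the bin width would then be zero.

```python
def histogram_counts(values, bins=5):
    """Split [min, max] into equal-width bins and count the values in each."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes hi > lo
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Hypothetical ages, for illustration only.
ages = [18, 19, 20, 21, 22, 23, 23, 24, 25, 26, 28, 30, 33, 38, 42, 50]

edges, counts = histogram_counts(ages, bins=4)

# Draw one text bar per bin: left edge, right edge, one '#' per observation.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * count}")
```

Changing `bins` plays the same role as changing the bin setting in a plotting library: the data stays the same, but the picture changes.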
Let's pick an example: SQL, Excel, whatever. Let's keep it in Excel. If I keep it in Excel, what will I now do? Let's say I have an Excel file, a sheet, with my cardio data, this data. Now, in addition to the data, what do I need to store? The description of the variables, yes. So one possibility is that I can have a text file, like the one I had at the top of this, describing all of this, which is typically what happens in text storage. There's one file called .dat and another file called .dsc or something of that sort, which basically describes the variables. The idea is that they have the same name: one extension gives you the data, and the other extension gives you the description of the variables that are in this data. Now, this is good.

Now, what's going to happen to the data? Certain code is going to be run on it. That code is going to assume certain things about the data. Whatever you want that code to assume about the data should be available in the data dictionary. Now, if that code is robust enough to say "whatever fields you give me, I will run on them", that's cool. But suppose the code requires you to know what kind of data is being used: let's say discrete data, let's say continuous data. In the future you'll be doing things like linear regression and logistic regression. Linear regression will make sense if the variable is a number; logistic regression will make sense if the variable is a 0/1. If you have that problem, then in the metadata you need to be able to tell not only what business information this variable contains but also what kind of computational object it is, so that the code can run. So what people often say is: I am going to make it very simple, and I am going to assume that my entire data frame consists of only one kind of data, so that when I run an algorithm on it, I know exactly what kind of input that algorithm is going to get. What I'm saying is a practical answer that many companies often have.
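One way to picture the data dictionary idea is a small lookup table that records, for each field, what it means and what kind of object it is, so that code can check its assumptions before running. The sketch below is only an illustration: the field names, descriptions, the `numeric`/`categorical` labels and the `check_row` helper are all hypothetical, not something shown in the session.

```python
# Hypothetical data dictionary: names, descriptions and kinds are illustrative.
DATA_DICTIONARY = {
    "age":     {"description": "Age in years",           "kind": "numeric"},
    "gender":  {"description": "Male or Female",         "kind": "categorical"},
    "product": {"description": "Treadmill model bought", "kind": "categorical"},
}


def check_row(row, dictionary):
    """Return a list of problems found in one record, judged by the dictionary."""
    problems = []
    for field, meta in dictionary.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif meta["kind"] == "numeric" and not isinstance(row[field], (int, float)):
            problems.append(f"{field} should be numeric")
        elif meta["kind"] == "categorical" and not isinstance(row[field], str):
            problems.append(f"{field} should be categorical")
    return problems


good_row = {"age": 18, "gender": "Male", "product": "A"}
bad_row = {"age": "eighteen", "gender": "Male"}

print(check_row(good_row, DATA_DICTIONARY))
print(check_row(bad_row, DATA_DICTIONARY))
```

An algorithm that requires numeric input could call a check like this first and refuse to run, rather than failing halfway through.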
I worked in a couple of companies, and there was one company where this was taken very seriously. When we put data back in, we had to convert it. In the situation I was in, it wanted everything in categories. So what we would do is take continuous data and do what was called fine classing, which means that we would divide it not into four pieces but into ten pieces: decile 10, decile 9, decile 8, down to decile 1. Every variable was now stored not in its original numbers but as 10, 9, 8, 7, 6, 5, 4, 3, 2, 1. So suppose I'm told his income is nine. What that means is that I know he is in the ninth decile: 10% of the people or so have income more than him, 80% have less than him; he is in that bracket. All variables were stored like that. Now what happens is that every algorithm knows that every variable is going to be stored like that, and you can keep writing algorithms. Otherwise, every algorithm would need to be written differently.

Let's say you are doing credit scoring, or you are doing CRM models or something of this sort. You build a very sophisticated CRM model that tracks your customers, and it works. Now suddenly you hear about a new variable coming in, a Twitter feed, and suddenly nothing works, and you have to go back and rebuild that entire model. That's going to set you back three or four months; it's going to set you back a few thousand dollars. So you say no: any variable that has to go in has to go in in this form, and if it goes in in this form, my algorithm can deal with it. In fact (I am going far away from the topic now), in practice an analytics professional has to struggle between doing the right thing badly and doing the wrong thing well. You want to do the right thing well, but doing the right thing well is going to cost you time, money, data and everything. So it is a choice between saying "I will get a model quickly built on a new data set" and "I am going to get an inefficient answer from a model that has already been built". These are more cultural issues about how an analytical solution is often
deployed in companies. They vary very much from industry to industry. They vary very much from company to company, with the culture of the company. They depend on regulatory environments. In certain environments, an auditor-like entity comes in and insists on seeing your data: "Show me your data." In finance this sometimes happens with regulatory entities: the Reserve Bank of India goes into a bank and says, "Show me your data", "show me your order book", "show me your loan book". Now that has to be done, and the decisions you made have to be made in a way that makes it patently clear why you have done this. So very often people say, not "I want to make the best risk decision", but "I want to make the most obvious risk decision", which may not be the same thing at all, because "I'm being audited". So that's a practical question, and I don't have a clean answer to that, but I do know what happens. Is it right? No, it's not, but we live in a world that has a kind of imperfection.

One of my teachers, his name was Jerry Friedman (you will see some of his work later on; he came up with algorithms like projection pursuit, CART, MARS and gradient boosting; he created many of the algorithms that you will be studying), was one of my teachers at Stanford. When he ran our consulting classes, he would say this: "Solve the problem assuming you have an infinitely smart client and an infinitely fast computer. After you've done that, solve the real problem, when you do not have an infinitely smart client and you do not have an infinitely fast computer." This was in the early 1990s, when computer speeds were a lot slower; we did not have powerful machines like this around. So a lot of this is done in that kind of situation, where you are struggling for continuity. When you are figuring it out, imagine yourself as an analytics manager, as I hope many of you will be, with an analytics team sitting in front of you. You are looking them in the eye, you know how much you're paying them, and you know that half of
them are going to leave at the end of the year. What are you going to do with regard to the modelling and things like that? Your first order of business is going to be to ensure continuity in some form: keep it simple, right? Keep it simple, keep it obvious for the next bunch of people who are going to come in, and for that you would be willing to trade off a little bit of "make it right". The new person coming in will not want to solve a very complicated kind of situation. This is not where you want to be, and I do not want to depress you on day one. But it's also the fun part of the profession; it is also interesting and exciting.

So the histogram command gives summaries of what the histograms are; each gives you a sense of what the distribution is. And as you can see from most of these pictures, most of these variables, when they do have a skew, tend to have a right skew. Maybe education has a little bit of a left skew, with a few people at one end and most people here. But even so.

Now here's an interesting plot. Matplotlib has this as well, but Seaborn has a better version of it. This is what's called a box plot; you've seen a box plot above. People are unsure as to where this box came from. Who has used this before? This used to be called a box-and-whisker plot; these are the whiskers. This is the median. The top edge of the box is the upper quartile. The bottom edge of the box is the lower quartile. The end of the whisker is 1.5 times the interquartile range above the box. If you want a formula for the whisker: the length of the whisker is 1.5 times the IQR. We should have a break now, so we will break, and at 3:45 we will be back. I haven't stopped; I just got distracted. So, 1.5 times the IQR: the whisker goes up to that. If a point lies outside it, the point is shown outside. If the data ends before it, the whisker also ends there.
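The whisker rule described above can be checked numerically without drawing anything. The sketch below uses the standard library's `statistics.quantiles`; the ages are invented, the function name `box_plot_summary` is hypothetical, and the `inclusive` quartile method is an assumption, since different tools compute quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_summary(values, whisker=1.5):
    """Compute the pieces a box plot draws: box, whisker ends and outliers."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1

    # A whisker may reach at most `whisker` times the IQR beyond the box.
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr

    inside = [v for v in ordered if low_fence <= v <= high_fence]
    outliers = [v for v in ordered if v < low_fence or v > high_fence]

    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        # The whisker stops at the last data point inside the fence.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical ages, for illustration only.
ages = [18, 20, 22, 24, 26, 28, 30, 33, 50]

summary = box_plot_summary(ages)
for name, value in summary.items():
    print(name, value)
```

Here the upper fence is at 42, so the upper whisker ends at 33, the last observation inside the fence, and the 50-year-old is shown as a separate point.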
Let me explain it another way. The top of the whisker is the maximum; the bottom of the whisker is the minimum. Look at this point here. What is this plot? Age, for males. So what this means is that this is the minimum, 18 or whatever it is, and this is the maximum, 48 or whatever it is. So if you see nothing else on the box plot, no other points other than just the box and the whiskers, then your five-point summary is sitting there. What happens if you see points like this? Outliers. What is an outlier? An outlier is a point that lies more than 1.5 times the interquartile range above the box. So the whisker will not extend indefinitely: it will go up to 1.5 times this box and then it will stop, and if any points are still left outside, it will show them as dots. You can treat this as the definition of what an outlier is. It is a parameter, which means you can change it. I won't try it now, but you can go to the boxplot syntax and change that 1.5. It's not hard-coded into the algorithm; I'm 95% sure, as a statistician I am never sure of anything. It is a parameter that you pass in the plot function; the default is 1.5.

These two colours are there because I have asked for two things: I have asked for male and female. If I had three of them, there would be three. This one here: the lower edge is Q1 and the upper edge is Q3, for males. From the bottom whisker to the bottom of the box is a quarter of your data, the box is half your data, and from the top of the box to the end of the whisker is a quarter of your data. As for the mean, there is also a setting in boxplot that you can play with which will give you a dot, and that dot is the mean; you can ask boxplot to do that. But the mean is not a standard component of the five-point summary; it is a different calculation.
sort. But if you want to, you can make the box plot give a dot on the mean as well. By definition, yes. So half the data is between 24 and 34, or whatever that is: half of all the men in my sample are between those two numbers. I think the box plot doesn't allow you to change the shape of the box. I think that is set; that's central to the idea of a box. It does allow you to fiddle with the size of the whisker, right? I don't think it allows you to fiddle with the size of the box. If you change that to something else, let's say the 20% point to the 80% point, an 80/20 rule, that's no longer a box plot; it is another interesting plot. The significance of it is exactly this, as we have seen before: the data looks like this, it's right-skewed. Think of the picture. So this is your Q1 and this is your Q3, and this is your Q2, or the median. Then the median is going to be closer to Q1 than it is to Q3, in the same way that the minimum will be closer to the median than the maximum is. Same idea. This is a summarisation for numbers. If you want to summarise categorical data, there is what's called a crosstab, or a cross-tabulation. This is simply how many products there are of each product category: 195, 498, 798. There were three kinds of treadmills, and they're trying to understand who was using what kind of treadmill. Our business problem is to understand who was using what product. This is a crosstab. What is this? This is something that will be used for categorical variables. No box plot will make sense here; there are no numbers. You can ask interesting questions here if you want to, and you can think about how to answer them. For example, you can ask the question: is there a difference between the preferences of men and women? Possibly. Is there, irrespective of gender, a product that they prefer? You can ask all kinds of interesting questions, and you can find ways to answer them, which we will do, though not in this residency.
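A cross-tabulation is only a count of how often each pair of categories occurs together. The sketch below does that count with the standard library; the records are hypothetical and are not the lecture's data set, and the function name `crosstab` is chosen here for illustration. With pandas, `pandas.crosstab` produces the same kind of table from two columns.

```python
from collections import Counter

# Hypothetical (product, gender) records, purely for illustration.
records = [
    ("195", "Male"), ("195", "Female"), ("195", "Female"),
    ("498", "Male"), ("498", "Female"),
    ("798", "Male"), ("798", "Male"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that never occur.
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}

table = crosstab(records)
for product, row in table.items():
    print(product, row)
```

Each cell is a count, not a measurement, which is why a box plot has nothing to work with here.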
That is for next time. So once again, this is descriptive. All this has done is simply tell you the data as it is. What I'm saying is that if you want to do a little more analysis on it, you now have to reach a conclusion based on it. For example, one question to ask is: do men and women have the same preferences when it comes to the fitness product they use? Now that's a question to ask. To answer that question I need to look at the data, but just looking at it will not give me the answer. I need to be able to find a statistic to figure that out. A statistic that does what? That in some way measures that difference, the difference between men and women. Or, what we will do is not measure that directly. What we will do is ask: if there was no difference between men and women, what should this table have looked like? And then we will compare the difference between these counts and that table. That's the interesting part of the statistic. What we will do is called the chi-square test. It's coming up in the next residency, but that's the prediction part, or the inference part, of this. This is just the description. A similar thing here: this, for example, is for marital status and product. Does the product you use depend on whether you have a partner, is marital status correlated with using one as opposed to the other? Okay. You can use counts as well: instead of seeing it as a table, if you want to see it as a plot, you can ask for counts. So there are things like count plots and bar plots which allow you to do counts. In the lab you'll do a few more of these. This is simply another visualisation of the same thing. For those of you who like things like pivot tables in Excel: Microsoft has made, you know, wonders of us all in corporate life. I was told that, you know, you can have another
master's or bachelor's, a master's in anything. Engineering is good, and it's nice if you have PhDs in a few areas. But what you really need is a PPT engineering degree, PowerPoint engineering. I mean, that's a necessary qualification for success. Certain tools have been used, and so those tools have been implemented in many of these software packages as well. This is the pivot table version of the same data set. This is the last, not last, but still, this is a plot. Let me show you this plot and then we'll take a break. This is a very popular plot because it is a very lazy plot. This plot requires extremely little thinking: you ask for a plot of the data frame. You're telling it nothing about the plots. You're simply saying: figure out a way to plot them pair by pair, and it does. So, for example, how would you read this plot? It creates a matrix. The rows are variables and the columns are variables. What is this? This is age versus age. Age versus age makes no sense, so what it plots there is the histogram of age. I suppose Python does this as well. An actual plot of age versus age, you're right, should have been a 45-degree line. That is a useless graphic, particularly as the same 45-degree line would show up in all the diagonals. So to make a more interesting graphic, it plots the histogram. This kind of analysis sometimes has a name associated with it. The name is univariate. Univariate means looking at it variable by variable, one variable at a time, which means looking at age, only looking at age. Univariate analysis: uni as in uniform, same form; unicycle, a cycle with one wheel. For the same set of data it would replicate the same thing; it would replicate the nature of the data. So yes. Remember the nature of the graph. So let's see this. So where is gender here? Where is gender?
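The layout of such a pairs plot can be described in a few lines. This is a sketch of the layout only, not of any library's implementation: `pair_grid` is a hypothetical helper, and the three column names are assumed to be numeric columns of the data frame. In seaborn the plot itself is usually requested with `sns.pairplot(df)`.

```python
def pair_grid(columns):
    """Describe what each cell of a pairs plot would show."""
    grid = {}
    for row in columns:
        for col in columns:
            if row == col:
                # Age versus age would only be a 45-degree line,
                # so the diagonal shows the variable's own distribution.
                grid[(row, col)] = f"histogram of {row}"
            else:
                grid[(row, col)] = f"scatter of {row} vs {col}"
    return grid

columns = ["Age", "Education", "Fitness"]  # assumed numeric columns
grid = pair_grid(columns)

# Each off-diagonal pair appears twice: (Age, Education) and (Education, Age).
distinct_pairs = {frozenset(cell) for cell in grid if cell[0] != cell[1]}

print(grid[("Age", "Age")])
print(len(grid), "cells,", len(distinct_pairs), "distinct pairs")
```

With three variables there are nine cells, three of them histograms on the diagonal, and only three genuinely different scatter plots.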
Is gender in my data? It is. So when I did the pairs plot of my data, what did it do with gender? Yes. Remember info: when we did info here, remember how it stored the data. It had product, gender and marital status; it had identified them as objects in the data frame. What does that tell you about the command, the pairs plot command? Yes, it ignores those objects. So, in answer to your question: if a column in the data frame has been stored as integer, int64, basically integer or numeric, it will plot it. If it is stored as an object, it will not plot it. This is the histogram. This is the same plot: this plot is the same as this plot, and this one is the same as this one here. No, that is not age versus age. This is just age. Age versus age would have been a 45-degree line, but it is not plotting that in the diagonal. It is not plotting age versus age in the diagonal; it is simply plotting age's own distribution. Yes, with the counts. What it is doing is essentially taking each observation and putting it in a bin, and the height of each bin is a count: it is a count of the number of people who are in that age group. Here, this is age. So between, let's say, 40.5 and 43.5, whatever these numbers are, there are three people. Remember, the histogram is a visual thing. You can get a numerical histogram if you want to, which means you can find out what those counts are; you can see inside the histogram. Just ask for a summary and it will give you what the features of that histogram are. But the histogram is not meant to be used that way. It's meant to be used as an optical device, to see the shape, to see the count. It's an art to do histograms: if you change the bins, the graph will look different. So I would suggest that unless you've got a lot of experience in this, or you really enjoy the
programming, do not fiddle with the histogram. Its shape will change. We'll come to it later, after the break. You can go in and change it: the size, the bin widths, etc. The bin width of a histogram takes a little more to change. You can find other things with which you can play. So there are ways to do it. Okay, quickly ending; we're losing our food. So these are different plots, and we will continue after the break. The rest of it is simply x versus y. So, for example, this is age versus education. Yes, he is right: this is education on the y axis and age on the x axis, or vice versa. These two plots, one-two and two-one, are just mirror images of each other. I remember when I was a kid, mirrors would confuse me. I would ask a question like this: when I look in a mirror, left and right get switched, but top and bottom don't, and I never understood why. Due to gravity? You'd think that if left and right get switched, top and bottom would too. Then I thought it was because of my eyes, so maybe I should look at it this way. That didn't help. So it's an important point. Symmetry is a good catch. The good catch is to realise that although there are so many plots, there are actually only half as many plots, because the plot on one side of the histograms and the plot on the opposite side of the histograms are the same. There is another question that a few of you asked: many of these seem to look like rows and columns. What are these rows? What is this row? What does this mean? It means that this variable, fitness, actually has very few numbers in it. It has the numbers 1, 2, 3, 4 and 5. Now why's that? Because remember how I defined fitness: it is my perception of whether I am fit or not. In my original definition of the variables, here you go: self-rated fitness on a 1 to 5 scale, where one is in poor shape and five is in
excellent shape. This is how the data was created. So in this data set I now have this variable in it. These kinds of variables sometimes cause difficulty. There's a word for it: these are sometimes called ordinal variables. Sometimes data is looked at as, you know, numerical and categorical, and categorical is sometimes split into nominal and ordinal. Nominal means it is a name: name of a person; north, south, east, west; gender, male, female; place, etcetera. Essentially it's a name. Ordinal is also categorical, but there is a sense of order: dissatisfied, very dissatisfied. So there's an order, therefore ordinal. This fitness variable can, if you wish, be treated as an ordinal categorical variable. For example, a Likert scale asks you for a seven-point scale: very dissatisfied, dissatisfied, moderately dissatisfied, neutral, moderately satisfied, satisfied, very satisfied. Mark one. This gives you data on a scale of, say, 1 to 7 or 0 to 6, so it will show up in your database as a number. For example, here you can see, instead of 1 to 5: very unfit, moderately unfit, okay, relatively fit, very fit. It is still giving 1 to 5. You can give it that way, or you can code it this way; your choice. So sometimes data looks like a number: the data type, in any database, will be recognised as a number because you've entered it as a number, but you analyse it as if it is a category. The opposite problem also sometimes exists, in that sometimes you get to see a categorical variable show up as a number, but you know it's a categorical variable. A zip code is an example. A zip code shows up as a number, but it's obviously not. You can't add up zip codes. You take two places in Bangalore and you want to find a place between them: that's not the average of the zip codes. Of course it may be close, but you can't do arithmetic with zip codes. The other difficulty with zip codes is that there can be many of them, which means that as your data set grows
the number of zip codes also grows. So the number of values that a variable can take grows with the data, and this sometimes causes a difficulty. Because what happens is that in the statement of the definition of the variable, you now cannot state how many categories there will be. You know that there will be more zip codes coming; you just don't know how many more will be coming. But you also know it's a categorical variable; you can't treat it like a number. And so there are special types of problems, like zip codes, that require special types of solutions. So the plot itself is a very, very computational plot. If it recognises a column as a number, it plots it. If you don't want it to plot that column, change the number to a category; most software, including Python, will allow you to do that.

Whatever news we have been hearing, air-to-ground attacks, say targeted missiles: in the majority of all these applications, some algorithms which we call reinforcement learning are implemented. And when you actually get into reinforcement learning, there is a good amount of uneasiness. It is very easy to talk about it, how it works and all these things. Theoretically, we can draw very nice diagrams on the board. But when it comes to implementation, or even to thinking about implementation: how to create the data is one of the toughest tasks. When I entered into reinforcement learning, I did not know these difficulties. We have shared some of the material: today's presentation, the PPT in PDF form, and also there is one write-up. The write-up is something I actually implemented some time back, and I'm trying to go for a patent on that. Of course, I have revealed it only to the extent that my idea cannot be completely copied. As of now, you have the scikit-learn library for machine learning, and only TensorFlow and the like, which came up recently, for
deep learning. And when we actually started programming for reinforcement learning, there was no library per se. We had to write our own framework to complete the task for just one application. It took around nine months for me to complete that end to end. Okay, so I'll be touching on this one tomorrow, in the second half, because I will not be able to implement it here; that is a bigger application. But I will be walking you through the different stages: if we take a real-time scenario, how do you actually model it and then get to the level where you can implement it? And before that we will implement some toy problems. There is a framework that is two to three years old; I will be touching on that framework and then we will get into the hands-on. There is something called OpenAI Gym, the framework. And of course there are, say, three or four more frameworks, which are mentioned on Quora, but they are not maintained well. And using an existing framework for reinforcement learning is challenging. The reason is that the data I use and the data you use are different: I want to do one thing, and you want to basically hit some target, so the environments are completely different. So coming up with a generic environment where everyone can use that library is very difficult. As of now this may look like a little bit of jargon; I will explain and then it will be clear. Okay, so the agenda is as follows. We start with an introduction to reinforcement learning. First of all, why should we be getting into this? There should be some business value, right? So we'll look at the patent trends in different technologies. You have already had a good amount of supervised learning, unsupervised learning and deep learning. Today we have reinforcement learning, and this is, as of now, one of the latest trending things in the market: deep learning, followed by reinforcement learning. How many of you know DeepMind, Google's
DeepMind, right? So people are already aware of it. And Google DeepMind's AlphaGo actually defeated the world champion at the game of Go, right? And it was, I think, 100 to 0. That was one of the toughest games, it seems; actually, I don't have any intuition about that game. It was one of the toughest games in which to beat a human. So people came up with combining deep learning and reinforcement learning, and now they are able to beat world champions in that area. And this happened just recently, right? So we'll see, in general, what the trend in patents is when it comes to these technologies. You would expect that supervised learning will be leading in patents; we will see some surprises. And then: what are the differences between supervised learning, unsupervised learning and reinforcement learning, with a few examples of what dominant problems can be solved using RL. Then we get into an introduction to RL: what are the important concepts, the important components, followed by one of the celebrated algorithms in RL, Q-learning. If you ask me why that name, Q, I actually don't have any idea, but probably the person who invented it has Russian roots, it looks like. And then we have a couple of case studies. One is Smart Taxi: we will do Smart Taxi in class using the framework I mentioned, OpenAI Gym. Then we will be solving the next one, Frozen Lake, four by four. And then we start improving Q-learning from there onwards. Then we get into the second case study, an extension of the same case study. And finally we go to the next level, an improvement of Q-learning, a second-level improvement: SARSA. The name looks like some person's name, but it is actually state, action, reward, state, action. Okay. And finally we will be touching
on this real-world case study; this document is already shared with you. So what can we expect? By the end of this module, I mean these four hours and the next four hours, we will be able to program, to some extent, real-world scenarios. Today itself I will also take a small problem, program the environment myself and share that code as well, so that you see how to solve the problem without using the built-in environment of the OpenAI framework. I believe that will be more beneficial. But I cannot go through that one in the classroom, because there will be hundreds of lines of code. I can explain how the structure of it goes, and then we can discuss different scenarios and, finally, examples of various models in RL. That's what we can expect from this module. And of course, having said all these things, the top-notch applications we are seeing in the market now, most of them are programmed using some form of RL. So there will be a good amount of complexity when you go for the bigger case studies. We will try to keep it as simple as possible in class, and towards the end we will make the case studies a little more complex and more realistic. Okay, so when it comes to patents, let us just quickly browse through the patents. For example, k-means is one of the trending words among data scientists. If you look at the number of patents filed using k-means as the search word, there were around 620-odd patents awarded. With deep learning, there are around 100K patents, deep learning being one of the oldest techniques; it actually started in the 1960s and 70s, right? And then reinforcement learning, which is one of the latest ones, you can call it, has 65,000-odd. And if you look at the latest period, I mean the last two to three years, the number of applications people are developing and going for patents on is dominant in
RL, and in combining RL and DL. These two areas are dominant in generating value. Also, if you look at takeovers of start-ups, people are focusing on these two, deep learning and RL. These two are dominant when it comes to start-ups. If a start-up showcases something real with RL, there is a high chance of it being taken over. With Google, recently, some nine months back, it happened in Bangalore, right? The team itself was acquired by Google. It was not even an established start-up. They showcased something, and it was taken over by Google for an undisclosed amount; they did not disclose how much money they paid. Let me, just to give you that confidence, show what the trend is in the latest period. There are two things you can use: lens.org and Google Patents. Lens.org is an open source of patent information. So if I go to lens.org and search for reinforcement learning, the trend is this. This year, we are just in March; that's why it came down. But this is the growth we are seeing: it is exponential, maybe exponential of exponential. And if you look at deep learning, it is somewhat smoother in the recent past compared to the previous one. Right? So the growth in RL is pretty dominant compared to DL, because everyone is going for automation. So if you want to generate value, I think this could be the next thing you can take up. You can also look at the other supervised learning algorithms in a similar fashion. You can go to Google Patents and see what people are patenting, for example "monitoring method and monitoring device of deep learning processor", and you can see exactly what people are doing in this space. So why should we get into this field? Basically, right, in trading terminology there is
something like "the trend is your friend": the trend is the trader's friend. If some market is going up or going down, that is the best situation in which they can make money. If the market is fluctuating around a point, they exit the system and go for a break. So if we are able to see a strong trend in one of these technologies, or some of these technologies, let us ride the trend. So in supervised learning we have X and y. X is our set of independent variables, and y is our target variable. We want to predict y using X. We want to find a function; the function is nothing but the relationship between X and y. That's what we want to do in supervised learning. Some of the examples: you have classification and regression; then object detection, which of course again falls under either classification or regression, predominantly classification; and image captioning, which combines multiple technologies, say DL plus NLP, to caption images. Look at this: it is what we have learnt so far. Then unsupervised learning: similar stuff, but without y. There is no target, just the feature space. We basically want to find hidden patterns in the data: learn some underlying hidden patterns in the given data. That's what we try to do using unsupervised learning techniques. Some of the examples: clustering, with k-means, hierarchical or DBSCAN; some of the dimensionality reduction techniques, such as PCA and factor analysis; and then feature engineering, of course; you can also use PCA for feature engineering. And then density estimation, using DBSCAN sort of things or any other statistical methods. So that is predominantly a bird's-eye view of the supervised and unsupervised learning we have had. And when it comes to RL: if a problem is not falling under either supervised or unsupervised learning, you are not basically able to
get data up front which can be easily handled, or it is a mix, a combination of supervised and unsupervised. Say, for example, I want to hit a target. That is a very ambiguous statement, because when I actually take off, due to weather conditions my direction will be changed. My initial direction, whatever it is, will be changed due to some external factors. But I have to hit the target. That means that when I trained the model, the coordinates, the information the system had, were different compared to now, when it is in the air. That means the flight should be able to take decisions on its own, looking at the external factors, and finally reach the target. If you listen to some of the interviews these days with IAF personnel: "We fixed the coordinates and we sent it, and it did the rest of the work." That's what they said. Look at it: the target is fixed, and the missile is fired, incorporating all external factors, and it did not miss the target. The error they mentioned the other day was single-digit, a few metres. That means the problems which have to incorporate the latest external factors, those are the problems where RL can be very good. Your regular supervised and unsupervised learning will not be so effective in those scenarios, right? So RL is more general than supervised and unsupervised learning. In reinforcement learning problems we will be hearing the following terminology again and again. There will be something called an agent. There will be something called an environment. There will be something called a reward. The environment is all our external factors. The agent is your plane. Actions: go straight, or take a slight left or a slight right, based on wind or whatever it is. At frequent intervals there will be communication; at frequent intervals it starts calculating what action to take: left, right, slight left, or go straight.
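The agent, environment and reward terminology can be put into a toy loop. Everything in this sketch is an assumption made for illustration: the state is a single number (a sideways position), the wind is a fixed list of pushes, the reward is +1 for getting closer to the target and -1 otherwise, and the agent follows a hand-written rule rather than a learned policy. It shows the loop only, not a reinforcement learning algorithm.

```python
# Actions available to the agent and how far each one moves it sideways.
ACTIONS = {"left": -1, "straight": 0, "right": 1}

def step(position, action, wind, target):
    """Environment: apply the action plus the wind, return (new state, reward)."""
    new_position = position + ACTIONS[action] + wind
    closer = abs(target - new_position) < abs(target - position)
    reward = 1 if closer else -1  # assumed reward scheme
    return new_position, reward

def choose_action(position, target):
    """Agent: a hand-written steering rule standing in for a learned policy."""
    if position < target:
        return "right"
    if position > target:
        return "left"
    return "straight"

def run_episode(start, target, winds):
    """Repeat state -> action -> reward -> new state until the target is reached."""
    position, total_reward, history = start, 0, []
    for wind in winds:
        if position == target:
            break
        action = choose_action(position, target)
        position, reward = step(position, action, wind, target)
        total_reward += reward
        history.append((action, position, reward))
    return position, total_reward, history

final_position, total_reward, history = run_episode(0, 3, [0, -1, 0, 0, 1])
for action, position, reward in history:
    print(action, position, reward)
print("final:", final_position, "total reward:", total_reward)
```

In the second step the wind cancels the agent's move, so it gets no closer and receives -1: the external factor changed the outcome of an otherwise sensible action.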
You have, say, your environment: the weather. Your flight, or your missile, takes an action. When it takes actions in the given environment, the flight's coordinates are different. Now you look at the target again: your coordinates are different. That means the data has changed; your state has changed. The agent's state has changed. Now, based on the state, how do you take the next action? Makes sense? So this repetition will be continuous, till we reach the target. Now there is this question: how do we say whether my action is correct or not, whether the action I have taken is correct or not? There will be some sort of reward associated with it. Say, for example, you actually moved in the positive direction, whatever is expected: I'll give you one point. Or, say, someone tried to intercept you, so you selected a slightly deviated direction: now I will give you minus one point as the reward. I penalise you. That means the action you took, instead of going straight you went right, is not a correct one to reach the target; to reach the target you had to go straight. Going straight, going left, going right: those are the actions we are taking, and those actions lead to some reward. Reward in the sense: are you closer to the target or not? If, because of your action, you are going away from the target, that means the action taken is a wrong action, and it should be penalised. Makes sense? So that's, in a few words, what reinforcement learning is. To solve this problem, people have come up with a lot of algorithms. Predominantly, RL is the science of decision making. And this is why people are running after it: automated decision making. The pilot is not sitting in this one. Your system is making the decisions, and it is not missing the target. A question: compared to supervised learning, if there
are outliers, how will the algorithm capture them? That is a slightly challenging question to answer. The data itself we have to create. Let me show you some examples; after that I will take up that question again. So: how do I collect data, and what is the volume of data needed to solve this problem? That is his question. Here is the example; I have two examples. One of the simplest examples: we actually live with it, in the sense that we have a lot of animals around us, and many of those animals are our friends. For example, many people have a dog as their pet. I have cattle at my village home. If my sister calls the buffalo, it comes; if I call, it actually comes to hit me. And it is very difficult to understand how the buffalo understands human language, right? But that is exactly what is happening. Now what is the logic behind it? Let us take a simple example. Let us try to teach a dog to do something. Consider a scenario where you want to teach a dog to do something. What do trainers normally do? They try to create some noise or some action, hand signs. Based on that hand sign, if the dog does the favourable action, what the trainer wants, the dog will be appreciated with either a biscuit or something else. Then next time, when the trainer makes the same hand sign or the same noise, the dog, say, tries to do something else, and it does not get the biscuit. So it goes: okay, what was the previous action I did when I got the biscuit? Let me try that one to get the biscuit. It repeats the old action, and it actually gets the biscuit. Now it has the memory: when there is a hand sign, like lifting the hand, if I do this I will get a biscuit. Getting the
biscuit is a reward; not getting the biscuit is punishment. Actions? Yes, the number of actions can be large; the action space can be continuous as well. When you take the flight scenario, it can take the rotation in a continuous manner. Why only three degrees? It can be one degree, or one-hundredth of a degree, or a fraction as well. So yes, you can have a lot of actions. But in this particular one: say you are the trainer. You lift the hand and expect the dog to get up. When you put down your hand, you expect the dog to sit down. There are only two actions: the dog has to get up, or it has to sit down. And, say, a third action: when you throw a ball from your hand, you expect the dog to jump. When it jumps, you appreciate it with a biscuit; if it doesn't jump, you don't appreciate it. So you are increasing the action space. Or, when you take the stick, it knows you are going to hit it or punish it, so that is a negative thing for it. So when you take the stick, it will be very obedient. When it was obedient when you took the stick, it did not get that much punishment; that's why it starts acting obedient. Makes sense? Now, all these things, though they look qualitative, are all quantitative. We can make all these things observable entities. Continuing with the dog example, let us try to quantify. Quantifying in the sense: let us try to get some sort of logic for what exactly is happening, how reinforcement learning works in a broader sense. For example, your dog is the agent, which is responding to, or is exposed to, the environment. The environment could be your house or your lawn. Now the dog encounters one state or another. You are in the environment and you are standing: that is one state. Standing with your hands folded and kept behind your back: that's the first state. When you do this, the dog comes and stands in
front of you. If you observe, the trainer raises his hand. That is a change in the environment; the state has changed. Yes. As a greedy person, when I try to program this, I want to capture all the movements when I lift the hand. Are all these things important, or is it just keeping my hand behind my back and raising my hand and showing this one? There are only two discrete states; are these two enough, or do I want to capture everything? As a greedy data scientist I want to capture everything. But, for example, this is one state and this is another state. When my posture changes, my state changes, and the agent starts acting. So we expect the agent to perform an action in response to the change in the state in the environment. The trainer gave the signal, a hand signal, whatever it is; that is the change in the state. Then your agent, the dog, performs some action based on that. Say he expected the dog to get up, and it got up, and then the trainer gave the biscuit. You giving the biscuit and the dog taking the biscuit is the dog receiving the reward. So for every state followed by an action there is a reward, whether it is zero, minus one or plus one: getting punishment, getting food. Note that if you don't give any reward, not giving any reward is also a reward, of zero. So after the transition, the agent may receive a reward or a penalty. Once you throw the biscuit and the dog takes the biscuit, again there is a change in the environment, and the change in the environment is a new state. So from the previous state I have now come to another state. It could be my previous state, where I stood keeping my hands behind my back, or it could be, say, lifting my left hand. So the states will be changing continuously. Based on the state, there
will be actions, and based on the actions there will be rewards. Rewards go to the agent, actions are taken by the agent, and because of those actions and rewards the environment changes. The environment changes in the sense that the states in the environment change. So it is a continuous process, and through it you expect the agent to complete the task you wanted. Makes sense, right? We looked at a toy example, but it works exactly the same way for factory machines. Now we can generalise this topic. In a sense, we are all tuned to do reinforcement learning in our lives. We may not have programmed reinforcement learning by hand, I mean with computer programming, but we are all tuned to do this activity. Okay, the policy. This is another term which you will be hearing again and again. The policy in reinforcement learning is the strategy of choosing an action based on the state. The policy is chosen by the agent. If the policy is the best policy, it is referred to as the optimal policy: for a given state, this is the best action. One of the simplest examples: say you are on a highway, no vehicles are in sight, and you are driving. Looking at that scenario, most of the time you press the throttle; you accelerate. That is what we have experienced in our lives. No one says it is safe, yet we go at 120 or 150. Our system is tuned to think: okay, this is the environment, let me take this action. We decide that this particular action led to a lot of benefit in the past, so let me take that action for this state. That is your policy. A state-action pair is your policy. The state-action pair with the best action for a given state is the optimal policy. And this is the beauty: once you train your system, at the end what you actually store,
what the automated system holds, is a list of all eligible states, each followed by its action. There could be one million states, even one billion, and the number of actions could be just 100, but those 100 actions are linked to all of these states. Say, for example, the input is a continuous time series. That is where deep learning comes into the picture: when you have a continuous space, how do you approximate? We will touch on that point tomorrow. For now, assume that all possible states are listed. So let me say these are all the states, state 1 to state 100. These are all the states I can think of, or showcase in front of my agent, which in the dog example would be, say, the hand signal taken as the state. Exactly. Now, if some state was not experienced by the system during training, and in real life you actually experience it, you may get a negative reward. The name of the algorithm is reinforcement, okay? This is a new state, and when I take some action in this new state I get a negative reward. So let me not repeat the action I took for the new state, because my experience was bad. But the first time I take that action for the new state, I am affected, and I have to live with that. There is no escape. The other thing you can do is look at similarity: is the new state close to one of these 100 states? Maybe it is close to state 50. This is a new state, it is close to state 50, so let me take the optimal action of state 50. Clustering is coming into the picture here. In fact this is what I implemented in finance, because in finance you cannot see all the states, so you approximate the states and then take the action based on that. So I have, let's say, states 1 to 100. Now, for state 1: in the dog example, the hand lift is what comes in as the state. For state 1, the dog jumped. This is one action. And for state 2, probably
sits is one action. These are just actions; they need not be optimal actions, right? Once you train your system over a period of time, for state 1 probably walking to the next point is the best action. So over a period of time you repeat these actions with your system and then come up with a list of optimal actions. Not just a list of optimal actions: each state mapped to one optimal action. Now, the output of reinforcement learning. You have a problem, and you train on it using RL. What does your output look like? The output is going to be just, say, a CSV file with two columns. Those columns are nothing but states and optimal actions. That's it. This will be the output from the system. The better the optimal actions are, the better the system will be. One state can have only one optimal action. So, say you are on a highway, whether near Bangalore or near Hyderabad: if the highway is clear and it is six-lane, you accelerate. These are different states, but accelerating is the common action, whether it is near Bangalore or near Hyderabad. So different states but the same action, because the states are very similar to each other, excluding the geographic location. So the action need not be changed. If something has changed, something has happened, then the optimal action needs to change. In production, how do we use it? Some states are similar to each other and share a common action, but the states themselves are different. You have a lot of states, and each state is associated with an optimal action. Say, for example, a lion and another four-legged animal: they look very similar, but the actions you perform when you are in that environment are different. Yeah, exactly. Because when you actually do
the same action when a lion is around, the reward is different. That's the point. Now, in production, how do we actually do it? Let me say we have video streaming, and I am sitting and watching that video. Every one second, or every 500 milliseconds, you capture the scenario; you capture the state. It is a continuous stream, and you are discretising it to the extent that your system can react. So you have a new state. You map this new state to your existing ones, find where that state belongs, and take that optimal action. Say you are on the highway and you performed the action of accelerating a little more. Now you are reaching the end of the curve in the road, so your state is changing, and after that you again get a new state. Now, say, a few hundred metres ahead there is a cow. The optimal action in this scenario is slowing down. So look it up again, and take the action. The faster we make the system, the better it is. When you look at self-driving cars, they say that if you go beyond 60 kilometres per hour the car will not be able to function properly. So does the reward for one action depend on future states, which we don't know yet? Take Q-learning, for instance: starting from some point, at every step there may be no reward for going left or right, so which way should I go? I completely agree; you always have this scenario. In chess, for example, or in any game, winning the game is the reward. To win the game you set a trap, and setting the trap temporarily looks like a negative reward, right? So the long-term win is the target, and the short-term loss is a negative reward. How do you keep this in memory? That's where you have Q-learning, and more advanced ones like SARSA. This is
actually part of the training of the reinforcement learning system. When you use it, it is already trained. Think of a cab: at 9:30, when you come out of the house, irrespective of what reward it will get, it has learnt that it should go straight to your office, which is 10 kilometres away from you. Initially you feel they are taking some long routes. Then the system starts suggesting shortcut routes. After two or three trials, when you take the next cab the route automatically gets suggested; you don't need to tell the driver after that. The reward here is lower fuel consumption and a shorter time to reach the destination. That is the reward for them. The system is getting trained in the background. So the long-term target should always be kept in view. And when I said reward, it is not actually one reward. He correctly pointed out that rewards are accumulated, in the following way. When I look at the reward only in the next moment, that is referred to as a greedy algorithm. When I look at the long-term reward, that is keeping it in memory: setting the trap and winning the game. So the present reward will actually be the reward starting from now, say at time step t, plus the rewards I am going to get over, maybe, 100 time steps from now, the rewards in the future. And they are not all given equal weight: I need to give my present reward a higher weight than a future reward. So you put in a learning parameter (usually called the discount factor), and when I go two time steps from now the weight it gives should get depreciated. So the learning parameter is less than one, and you raise it to a power. Since the learning parameter is less than one, when you get two time steps into the future that reward matters less, but we are still giving it some importance. Now you can play
with these things, and that is how we get memory into the system during training. So your actions are not instantaneous; your actions take future rewards into consideration as well. Makes sense? This is what they were asking. So how do we play with this? You have come across this kind of parameter in AdaBoost and gradient boosting as a learning parameter, and in deep learning you have been living with it. So far we are fine: reinforcement learning is nothing but looking at the state, taking the action, and getting rewarded for that action. And over a period of time, what are the optimal actions for the given states? That's what we have to figure out. The output of reinforcement learning is going to be just this table. In production, it is just this CSV table that is deployed. For the system to work there will be a stream coming in, a new state is identified, the appropriate action is taken, and it is continuous. How fast you can react is nothing but your accuracy. Some people say, I can react in one or two microseconds; then you can hit the target even before the others realise it. Now look at a self-driving car. You have a camera in front of the car, and it is taking pictures. How frequently do we want the pictures? Maybe every 100 milliseconds. Though we call it live streaming for marketing purposes, it is actually frames. How many frames do we want in a given second? Let us say 100 frames, frame 1 up to the latest frame. This frame is explaining my state, right? This frame is nothing but the state I am in. And this frame has to be associated with other quantities as well, right? The speed at which you are driving at present, and other external factors, probably. When I actually
ride my scooty, my son says, Daddy, some cycle-wallah is going faster than you. Normally I cannot ride faster, but others may go at 120 or 150 on their bikes. Each frame can have only one state, because you take this frame and, using your clustering, you say this frame belongs to this state. Along with the frame, a self-driving car will have a lot of other attributes. Google spent several hundred man-years to build its self-driving car, simulating all these things, and still we hear, like recently, of a self-driving car hitting a human on the road, because that particular scenario was not captured earlier. So you have the frame and also the associated attributes. All these things together are the new state. This new state is matched to one of the states in your CSV file; then take the action. Once you take this action, it leads to the next frame. So it is an iterative process. Training will never take place at the client's place. The people who sold you that self-driving system have trained it with enough scenarios and given it to you. If they want to train with new scenarios, that is when they say, we are giving you updates. They are adding more scenarios. The data in it will be, say, very limited: what is the scenario in front of you, are there any objects, and if there are objects, how far is the object from your present location? You can detect that, right? When a flight is in the air, it is able to tell at what altitude it is flying. It is able to detect it and tell you its coordinates. Basically a signal is sent from your flight, say a sound signal, and how fast it goes and comes back is the algorithm used to detect the distance. That is how pilots are able to say, we are
actually cruising at around 11,000 metres. The system should get all these things. Otherwise the data we have is not fully defining the state. Do people agree with me or not? Let's say how far ahead some particular object is from me: if I am not able to detect that, I will end up in an accident at some time. Looked at differently, it is a useful piece of data to have as an attribute. What I am saying is: this is what is available, we just have to capture it and use it. Once you have all of this, you take actions. That action leads you to the next state, and that next state again includes the frame and the attributes. As with other models, we get new data, train, test and then deploy; is it the same here? Exactly, it is the same. But if you look at it, there is a significant difference between supervised learning, unsupervised learning and reinforcement learning. In supervised learning the target is fixed. Here your target is evolving. I want to go from Hyderabad to Bangalore, but my target is short term: every 100 metres, say, I check whether there is any obstacle, reach the 100 metres, and evaluate again. The target is evolving. So there is a significant difference between RL and SL, and unsupervised learning too. Clustering comes in as part of the state aggregation. Even after you have trained a model, there will still be cases which the model has not seen. Yes. Then also we have to handle them. Even in RL scenarios you collect data sets the same way; train, test and deployment have to be there. Test here means, say, you have a self-driving car, you put it on the ground, create hurdles, and see how it reacts in the test environment. So you create the data. Because of this association, I
think these things are being oversold, and badly so. Many people think that it is all automatic, not only RL but even machine learning. I completely agree. So here is the thing: when you explain the self-driving car to anyone, it sounds like a very strong system, right? But self-driving cars, Google's and others, are not suitable for Indian roads. The main reason is, for example, lane discipline, which is violated very frequently. Exactly. And cows may be roaming around freely. When those sorts of scenarios are not part of your training data, the system doesn't know what to do, and it meets with an accident in that case. So the intelligence is not really there; it is a marketing term. We are evolving. The better you can say, with a lot of confidence, mine is a very good system, the better you can sell it, for a few million dollars. For instance, you talked about a scenario which did not exist before; shouldn't it be mapped to some default state? Exactly, that actually happens in your car in an unexpected kind of scenario. For example, say you are driving your car in fourth gear and, without coming to neutral or pressing the clutch, you brake: the engine automatically stops. You have to define that for all such scenarios. For your business problem, for instance a trading or stock problem, whatever the default state is, all these corner cases have to be defined. That is why building a reinforcement learning system takes a lot of time. It looks very, very easy to explain, but look at the development. Now you see the complexity: we have talked about new data points coming in and all these things, and coming up with a general framework to solve this problem is far from reality. Now let me go to the next
one. This is something I think almost all of us have experienced: how do we teach a baby to walk? It is exactly this: we give some reward. The reward could be just clapping. Kids are attention-seeking people, right, in general. So when the kid gets up, she doesn't know whether that is a good activity or a bad activity. We all start clapping, and there are smiling faces all around. So the kid gets it: when I do this action, I see a lot of attention towards me; let me do it repeatedly. Then he puts one foot forward and falls, and we all look sad. So the kid thinks: that action of falling is not liked by my people, let me not do that. Let me do something slightly different, because by then I don't know which is the correct one. I do something slightly different, people like it, so I try to repeat it again and again. Whether the kid looks at the smiley faces, or at a chocolate, or whatever it is, from his point of view the smiley faces and the chocolate are all reward. When we say he fell, that is also a reward, a negative one. By then the kid does not have a communication mechanism developed. For example, my son started talking only at the end of his third year, and we were sending him to school from the start of his third year. His teacher used to tell us, he doesn't speak to anyone. But I could understand whatever he was trying to tell me. That communication mechanism was established between me and my son, but not between teacher and student. So whatever actions he was making, my system was able to understand them; the teacher had to train her system for the new data points. Makes sense, right? Now, this is what the slide shows: a child on a surface. When the child puts a step forward, they reward him with chocolate. If he falls, you don't reward him with chocolate. So when the reward comes, you
know he put those two steps well. It is actually much more complex than this. Online learning is, again, a marketing term. We say, my car goes at 300 kilometres per hour; that is what the company claims, but in reality you end up going at 120 or so. Online learning is exactly similar. People claim online learning the way they claim automated machine learning. Some people, the Microsofts of the world, say you don't need to write a program, you can just drag and drop. Yes, we can do all those things, but at what cost? They are trying to simplify, but there is a limit to simplification. Online learning too is basically for marketing; in reality I hardly believe in online learning. Of course there are algorithms for online learning. In reinforcement learning, if you take it as states and optimal actions, and the system is taking actions online, maybe you can say my system is automated and taking actions online, or in real time. Can we go ahead? Are we getting a sense of what kind of problems reinforcement learning can handle? The data sets can be very big, and the data can be pretty ambiguous at this stage. Now you have the agent, the kid. He did some action, say standing up or walking forward. Then the environment, the surface he is on and you, observed it and gave him a reward. Before that you had to interpret it: this action is good. You interpreted it and gave him the reward. Then he tries to repeat the same thing. When you interpreted and gave him the reward, your state changed; the environment's state changed. Again, it is a continuous cycle. So now we have seen those two examples. By the way, when you start building at the next level, any example or any case that you try to solve is very similar to these two examples.
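The cycle described here (the agent acts, the environment answers with a reward and a new state, and the agent adjusts) can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the session: the five-state corridor, the reward values (+1.0 at the goal, -0.1 for every other step), the function names `step`, `train` and `lookup_table`, and all parameter values are assumptions made for the example. It uses tabular Q-learning, one of the algorithms named in the session, with a discount factor `gamma` below one so that future rewards count for less than the present one, and it ends by printing the state-to-action table as two comma-separated columns.

```python
import random

N_STATES = 5                  # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    if nxt == N_STATES - 1:
        return nxt, 1.0, True     # positive reward for reaching the goal
    return nxt, -0.1, False       # small negative reward for every other step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state = 0
        for _ in range(100):      # cap on the number of steps per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                       # explore
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])   # exploit
            nxt, reward, done = step(state, action)
            # Present reward plus the discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def lookup_table(q):
    """One best action per non-goal state: the two-column table."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}


policy = lookup_table(train())
for state, action in policy.items():
    print(f"{state},{action}")
```

Changing the reward values inside `step` changes the table that comes out, so the reward definition is a design choice of whoever builds the system rather than something given by the problem.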
You take any example, whether you are talking about a self-driving car or about, say, hitting some target: it all again boils down to this. Only your data changes. What is my state? The state definition will be different for different problems, but the process is the same. Now, as we were saying: can I keep the long-term target in view and have the present rewards adjusted for it? That is a different algorithm. To solve this efficiently, people come up with different algorithms, depending on whether we want to be greedy, I mean focusing only on the next reward. Focusing only on, say, the chocolate, when what you wanted was for father to lift you and take you to the bazaar. So I walk, as a kid, and I don't take the chocolate. I refuse the chocolate. That is a negative thing; I don't take the reward. Then my father thinks, okay, he is not taking even this expensive reward, probably he expects something more. That is what all kids do, right? So RL is actually the intersection of some known areas. One is optimal control. This is what communication engineers say; they are very fond of this particular term, optimal control. They focus on optimal control, and RL from the optimal control point of view appears in, say, aerospace applications or communication network applications. Next is operations research. Operations research has pretty interesting problems. Say, for example, we start a big blood bank in the city. There is a blood supply, and there is also a requirement from different hospitals. Some blood groups are universal donors, some blood groups are universal acceptors, and some are only one-to-one. With a given supply and a given demand, what is the optimal way so that everyone's demand can be served? You can frame it as a reinforcement learning problem. But it is a supply chain problem: solving a supply chain problem
using artificial intelligence. I have explained only up to this point; now we have to get the intuition of what could be my state and what could be my action. If a universal donor is giving to a universal acceptor, probably that is not the right pairing; a universal donor giving to a specific group could be the right one. So you iterate through all these actions and finally come up with a proper match. Look at the supply today, whatever blood we have in the bank, and so on. Makes sense, right? Next, economics people use it with game theory, and then there are medical applications, neuroscience, and now artificial intelligence, where it is being deployed. Since people were talking about state aggregation: some time back, my friend at IIT Mumbai and I were trying to solve a problem empirically. We spent a few weeks and came up with a state aggregation. You have continuous data coming in; how do you get the states prepared? You cannot take data in a continuous form. Maybe every 100 milliseconds or every one second, how do you consolidate all that information and then go on to decide the action? So we had an empirical approach, but we had some doubt about its validity, so we went to a professor called Borkar. By the way, Professor Borkar is one of the leading research scientists in India for reinforcement learning; he is at IIT Mumbai, his students are at IISc, and that group is pretty big in Bangalore as well. So we explained the problem to him, and he said, okay, go and see my paper from 1995; I proved this theoretically. Almost 15 years before, what we had solved empirically he had proved theoretically: state aggregation, how to do it in a continuous manner, how to aggregate this information and put it as a state. That is nothing but what some
people call vectorisation and some people call clustering; in RL terminology it is state aggregation. With continuous data, you have to consolidate all the information from, say, every 100 milliseconds, and that is what the state is. Makes sense, right? Then, how is this beautiful subject, reinforcement learning, seen by different people? When you come from computer science and machine learning, these are the intersections you have. Different people try to handle this problem differently, but in the end automation is the target. The other thing is that there is no target variable here. It cannot be fitted into supervised learning. Automating the solution of those sorts of problems is nothing but reinforcement learning. It is the core of decision making, and if we want very precise decision making, that can be achieved to a great extent with RL. Now let us see the interaction between the agent and the environment. You have an environment, and in that environment you have states, rewards and actions, right? The environment emits the state and the reward, and in response the agent sends back an action. Because of that, something changes here, and the environment starts emitting again. This is exactly what is happening in a self-driving car or any continuous system. This is what I wrote here: it is a continuous process. A state is emitted, you look at the lookup table, the CSV table, and the action is sent. When another state comes up, you again send an action by looking at the table. Since you are just looking at a lookup table, taking the action takes just a fraction of a second, a couple of microseconds. Can we go ahead? So, in a way, reinforcement learning is the science of making optimal decisions, right? The crux of the story
is the science of making optimal decisions by looking at the experience you have had so far. Experience is nothing but training the system, right? Breaking down the process of reinforcement learning: observing the environment, then deciding how to act using some strategy. Observing the environment is nothing but state aggregation: the last 10 microseconds of information, let me put that as my state. In the last 10 microseconds my vehicle travelled at, say, 120 kilometres per hour, the coordinates went from here to there, the present coordinate and the previous coordinate are so-and-so, and the fuel consumption is so-and-so. That is my observation of the environment. Next, acting accordingly: you take your optimal action by looking at the table. Then receiving the reward, or reaching the target. Then learning from the experience: if it is a good experience you try to repeat it, and if it is a bad experience you avoid that action. And then you iterate until an optimal strategy is formed, until an optimal strategy is figured out. When you have done all these things, you have a trained system. I already wrote about hyperparameters: the learning parameter here is one of the hyperparameters we have to tune. I did not name it because I had not introduced the term yet, but that makes sense, right? Can we go ahead? So, some of the components in reinforcement learning. One is the reward. This reward will keep coming up again and again. The reward is nothing but a scalar quantity. It could be dollars or rupees, or it could be something like a punishment. And our target is maximising the reward. By the way, no one will tell you what the reward is when you go ahead and build your system. It is completely unknown; we have to come up with our own reward definition. Even for the same problem, you may have the reward defined in
one way and I may have the reward defined in another way. So though we solve the same problem, your set of optimal actions and my set of optimal actions for a given set of states will be different. To win the chess game, I may set a trap, whereas you may not like to lose those pieces initially, so you don't set the trap and you take a lot more steps to win the game. That's it. So the target is going for the maximum reward. Now some standard examples of reward. This is just a sample list; when you are building any solution for a given problem, you have to define the reward. For example, the dog receives a reward from the trainer for doing the activity the trainer expected. So that reward is getting a biscuit or not getting anything. The second option is a beating, and then you get a call from Maneka Gandhi. So, leaving aside the beatings, the environment also has its limitations, right? It cannot take just any action it wants to. Next, for example, there is the autonomous helicopter. This man is Andrew Ng from Stanford; his example is pretty famous. I'll just play the video. First of all, the helicopter is taking off. You can see how many states are possible at this point, and it has to balance. He created this example on his own with his students. It is still flying. He set the target. If it goes and hits the land, that is a serious punishment, right? So that action will not be performed again and again. At first it is just flying in the air, staying there only. Then he set the target so that, instead of just hovering there, going forward has a higher reward, so it starts reaching the destination. It should not fly upside down; those actions will be penalised. Andrew Ng is actually standing next to it in that video. It is slightly bigger. Exactly. The state
action table, the optimal state-action pairs, is actually running inside it. You might have thought it is a remote-controlled helicopter, but this helicopter is flying with a reinforcement learning algorithm incorporated. In fact you can see his photo next. This is the man; you can see him doing all these experiments. This is the man we are talking about. Okay, so for this flight manoeuvre, some particular rewards are designed: a positive reward for doing the expected activity, going forward or being in the air, and a negative reward for the other actions. Exactly. So a reward with a positive sign is a reward and one with a negative sign is a penalty, but we normally don't call it a penalty per se; we call it a negative reward. Then, say, defeating the world champion at backgammon or chess. Initially you set a trap, and you keep following the plan until you win. Finally, winning or losing the game is the reward: winning the game is a positive reward, losing the game is a negative reward. Next, managing an investment portfolio: making money, even one dollar, is a reward; if you lose one dollar, that is a negative reward. Then controlling a power station: if we are able to manage it efficiently, that is a reward; any security incident is a negative reward. I am just giving an idea of how we define rewards. Each problem will have a different kind of reward definition. Then, how can a humanoid robot walk? How can you make a humanoid robot walk? Exactly what we did with the dog and the baby, the same thing: actions and repetition. Forward motion of the humanoid is a positive reward, and falling over is a negative reward. And then playing different games, different Atari games: you now have systems playing games on
their own, right? So those bots start playing among themselves. They actually become pretty intelligent, and after some time we may actually find it very difficult for us to control. That's what actually happened in AlphaGo. Understand, we are defining it. We are defining that. Okay, very good. Okay, how do we define the reward in the helicopter case? There should be sensors which sense the distance, how close it is to the ground. Exactly. It actually took the action, say for example tilting, and because of the action, what is the altitude now? If the altitude falls, it is a negative reward; if the altitude meets what is expected, that is the expected action. You set the target, and it reached the expected GPS location. Okay, now you have reached the expected GPS location, so your altitude should start falling. Exactly. Based on the environment state, based on the new environment state, your reward will differ. Is this reward nothing but some factor that multiplies into the weightage? From machine learning, the weights are like a function, a function which actually accumulates the output of all the actions; the better it captures the output of all the actions, the better it becomes, and your system will perform better. Makes sense, right? Can we go ahead? So now you see goals and actions. This is another term which you will come across. You need to have goals and actions: a set of actions, and a short-term goal and a long-term goal. Okay, the short-term goal is coming from here, and the long-term goal is, say, yes, I want to win the game. How do we actually incorporate those things? And then state. We actually discussed state quite a bit. What is the state? State is nothing but the environment perception. The environment perception may be the image, for example your vehicle image, which we have taken, along with all your vehicle's attributes. Makes sense? So this is about state. And now, where the juice,
I mean where the main information in reinforcement learning is: the better you make the states, and the better the actions you make for the states, I mean all possible actions for all possible states. If you can actually get all possible states for the environment or for a problem, you can actually train your model very well. And this is where we actually get a little bit greedy. Okay, how many states can we actually handle? Maybe 1,000. If you actually say 10,000, training them takes a lot of time. Okay. And, say, one million states is going to be hard; it actually needs bigger systems to train. Say, now we are able to take all possible frames in the video, right? All possible frames in the video. Then the policy. This is what we discussed. Policy is nothing but this: for a given state, an action is associated. Optimal policy: for a given state, the optimal action is associated. These are the terminologies. We have looked at reward, goals and actions, state, policy. These are the standard terminologies we have, right? Now, finally, so far we did not talk about the model. How does a model look? A model is nothing but the agent's perception of the environment. And this model includes the CSV file, what we worked with. Okay, and in that CSV file, or say if you actually go a little deeper, we may be able to predict what the next state would be as well. Apart from the possible action, you actually predict the next state as well. If you're in this present state, what is the probability that you will be in the next state? Okay, now you see we are getting into probability: what is the probability that we will be in the next state? And that's how we end up in Markov chains and the Markov decision process. Today I won't introduce it; tomorrow I'll introduce it. In fact, reinforcement learning is usually introduced through the MDP: the Markov decision process is explained, and then reinforcement learning comes. But if I actually took up the Markov decision process in the
afternoon, everybody will be sleeping. Okay. And in fact, for us to implement, we don't need the MDP. We are practising data scientists, and if any of you want, say, a theoretical understanding of this one, I am willing to discuss it separately. If anyone wants to take it up, there is this man. Now, this is the man; he actually leads the AlphaGo project at DeepMind. These are his lectures. He actually taught them at University College London. There are around 10 lectures, each lecture approximately one hour 40 minutes, so around 14 hours. On the stress testing: they are actually giving a different state which was not experienced earlier, and now you're trying to see how the system reacts. Are you able to do that? Yes, it's an option, but it is not incorporated into your system. If you give any new state, exactly, that state needs to be identified with something which is already there, maybe using clustering, right? So, like I said, this man is one of the celebrities in RL. Okay. So you have enough material on his website, on his own website. Again, I'm going back to MDP. So his name is David Silver, and by the way, in his classes he very frequently advertises that DeepMind is recruiting. If you people are interested, you can apply. When you actually listen to him, you feel like, well, I have to do this. That sort of influence he actually has in his voice. Okay. Now, the framework you were mentioning. Okay, yes, we talked about the theoretical stuff. What we want is basically to solve problems. We basically want to have states defined, actions defined, and some way of training the system, correct? And the framework for all these things, okay, the basic framework, is not like scikit-learn. Actually, scikit-learn spoils us; we expect all the comfort in life because of scikit-learn. So for this sort of RL algorithms there are hardly any frameworks available, and this is one of the initial frameworks
and it has a few examples through which you can actually develop the intuition, and after that, if we want to deep dive, we have to write our own frameworks. Okay, so this is OpenAI Gym. You can actually get into OpenAI Gym and start using it. Its documentation is not so great, unlike scikit-learn; scikit-learn has pretty good documentation. How do we solve the problem? We talked about the problems, but who will solve the problems unless we have a mechanism to solve them? So one of the algorithms people have proposed to solve reinforcement learning problems, I mean problems of this kind: okay, people actually proposed retraining the system, or relearning, the reinforcement. Okay, and one of the algorithms is Q-learning. Q-learning is off-policy. It actually doesn't start with any policy. We will see what Q-learning is. It doesn't start with any policy. It doesn't have a model per se. Okay, the model it has is only like, say, k-nearest neighbours classification, right? In k-nearest neighbours classification, what is the model? It's only k-nearest classification: what are your data's neighbours? That is it. There is no model per se, except k. Exactly. And Q-learning is model-free. You just have a CSV file as an output. You just have a .csv file, that's it. So it is model-free. Off-policy means it is not required to follow a specific policy while training. Okay. It actually looks at all states repeated times, and generally, for good states it actually starts accumulating positive reward, and for bad states negative reward. Okay, so that's what it does. And at the end it actually has the following: the set of all states and the set of all actions, as a data frame, you can put it. Set of all states, set of all actions. You visit all the states for all the actions multiple times. Say for example, in state one you performed action one; because of action one you may end up getting into state four. So the cycle actually repeats. So
it actually starts repeating this information again and again, and at some stage it actually converges. That means the Q-table, these values, will not be getting updated much; that is what we call convergence. Now, how do you use it in production? How do I actually say what is the best one for a state? Whatever the actions are, which one is the maximum? Let me use the index: when I'm in state index zero, perform action four; when I'm in state index one, perform action five; and when I'm in, say, the next state, perform action five. Now I am getting states associated with the optimal actions. This table is repeatedly updated, and at the end I'll get this table, and then when I actually ask what the argmax is for each row, that will give me my action, the optimal action. Argmax is nothing but where the maximum is happening, at which particular place; that index will be thrown out. So Q-learning basically produces this table, and from that we basically prepare our optimal actions. That's it. And this is one of the simplest algorithms, and I feel RL is more natural than any other supervised or unsupervised learning. So I think that part is simple. But for some other guy, who actually has a lot of fancy for something like computer vision, that is very good. The reason is, they say, looking at a photo and identifying it is much better than understanding multiple languages; translating those languages will be difficult, and we have the habit of getting into multiple languages in the same sentence. So how do you deal with it when the action might be different depending on the future state? So for example, I am in state index zero, and I performed an action, say the optimal action in this one is index four, and what I'm saying is that the state can change because of my action. Yes. If there is
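The argmax step described above can be sketched in a few lines. This is only an illustration, assuming the Q-table is held as a NumPy array with one row per state and one column per action; the numbers in `q_table` and the helper name `greedy_policy` are made up for the example and do not come from the lecture.

```python
import numpy as np

# Hypothetical Q-table: 3 states (rows) x 6 actions (columns).
# The values are invented purely to illustrate the lookup.
q_table = np.array([
    [0.1, 0.0, -1.0, 0.3, 2.5, 0.2],   # state index 0
    [0.0, 0.4, 0.1, -0.5, 0.2, 1.8],   # state index 1
    [1.2, 0.3, 0.0, 0.1, -2.0, 0.9],   # state index 2
])

def greedy_policy(table):
    """Return, for every state (row), the column index of the largest Q-value."""
    return np.argmax(table, axis=1)

policy = greedy_policy(q_table)
print(policy)  # [4 5 0]
```

Read row by row, the result says: in state index 0 perform action 4, in state index 1 perform action 5, and so on. In production only this lookup is needed; the Q-values themselves are no longer updated.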
a possibility, as in the previous one you suggested, of predicting what the next state probably is. Now, knowing that my action may change the state, is that another dimension to this? Q-learning is the starting point. On top of this, you know, we actually do all the fancy things. So right now we are explaining just steamed rice cooking. Okay, from here you can actually make a biryani; you can get a lot of recipes prepared on top of this one, but this is the base. So at the end, when you actually look at this problem, I think defining a reinforcement learning problem is much easier compared to any other supervised or unsupervised learning. It is very easy to actually see what the problem is, and it is very easy to actually define what the reward is, what the state is, what the action is. Whatever you are saying, maybe you're the right person to actually define it; you're putting the condition there. So, any questions so far? It is fine, right? Let me actually give an example. In general, what happens when we start training the system? When it starts training, any reinforcement learning algorithm actually starts with zeros. The Q-table starts with all zeros. Okay. And from there it actually starts updating. And the action: among all the zeros, say there are actions 1, 2, 3; among these three actions a random action will be taken. To train this, do we need the size of that? Yes, of course. Then only can we train the system. If you don't know how many states are there, if you don't know what state is going to be there and what the expected actions are, training the system will be difficult, right? At training time we may miss some, because there are a million scenarios. And while the system is running, if we come across an unknown state, basically that is where k-means clustering comes to help: this latest new state, which was not experienced in the history, is similar to so-and-so. So in this example of helicopters, when you are going up to five, I should
increase by one, and the state will be something based on that action. Now you have reached the state; after that it has to be reversed, right? So how have we identified that? So Q-learning will not identify that? Let me actually answer that one. The state where you are is on the top. Say for example this is state zero. Okay. When you are in state zero, your target is going straight and then coming down. If that is your target and this is the state, okay, the possible actions could be: go forward, increase your altitude, decrease your altitude, turn left, turn right, any of these things. So say for example, if you actually say go forward at the same altitude, this is one of the actions. You're continuing. And then when you're here, you know what the GPS location is. As soon as I reach the GPS location, my state is changing. Here it is not state zero; it would be some other state, say 100. In state 100, when I am, say, 100 kilometres or 50 kilometres from my destination, I start reducing my altitude. That means not only going forward: reducing altitude and then going forward is the action. Otherwise you are actually only going forward, and this is a bad action; you will not be able to land. So if I have to make a matrix for this, say for example there are 100 states in total, states 1 to 100, and the actions are, say, go straight, back, down, up; these are my actions. And when I am actually in the simulation, I repeatedly run this one, the simulation. Whenever I am in, say, state zero, state one, state two, whenever I am here and perform this action, this is negative, some big number. Whenever I perform this action, this is a negative big number: I am increasing the altitude, which is not right. If I perform this action, I am actually decreasing it. I do this activity again and again. And when I do this
activity, going forward, I may end up in state one. Here also I have to perform the same activity to get a better reward. As soon as I reach 100, I cannot do this one; I'll get a negative reward. I have to actually perform this one. Any one of those actions can be performed for a given state, but only one of those actions will be the right one; the others will be deviating, the others will be wrong actions. Makes sense? Is it clear or not? Now, how do we update the Q-table? Okay, that Q-table, if I actually write it slightly differently, let me put it here. The question, please don't worry about it; it may be a distracting element. I'll actually write it simply. This is the Q-table, whatever we had earlier. That is nothing but a table with the states and actions. This is nothing but Q of state zero, action zero, up to Q of state zero, fifth action; there are only five or six actions, for example. Later, if there are 100 states, Q of the 100th state, zeroth action, up to Q of the 100th state, fifth action. Okay. To start with, we actually start with all zeros here. Okay, initialise with all zeros. Now, how do we update? When I am in state zero and perform action zero, what is the reward? I may get ₹1 as a reward. Okay. State zero, and I'm performing action zero, and the result is actually pushing me to state three. The next state is going to be state three. Because I have taken action zero, I got ₹1. The reward has to be updated in my present entry: state zero and action zero. So to start with, it already has zero. Okay. Say for example the learning rate, a hyperparameter, let me take it as one for convenience. Okay, and the reward is ₹1; this is the reward I got. The discount factor also, let me take it as one for the time being. And this quantity I am looking at is in the next state: because of my action I am landing in state three, and in state three, what is the best reward possible? We are starting with all zeros, so all will be zero
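The single update being worked through here can be sketched as follows. This is a minimal sketch, assuming the standard tabular Q-learning rule, new Q(s, a) = (1 - alpha) * old Q(s, a) + alpha * (reward + gamma * max over a' of Q(s', a')), and a NumPy array as the table; the function name `q_update` and the table size of 100 states by 6 actions are illustrative assumptions, not the lecturer's code.

```python
import numpy as np

def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply one tabular Q-learning update in place and return the new value."""
    old_value = q_table[state, action]
    best_next = np.max(q_table[next_state])      # best value reachable from the next state
    target = reward + gamma * best_next          # reward plus discounted future value
    q_table[state, action] = (1 - alpha) * old_value + alpha * target
    return q_table[state, action]

# Table initialised with all zeros (assumed size: 100 states x 6 actions)
q_table = np.zeros((100, 6))

# State 0, action 0, reward of 1, landing in state 3,
# with learning rate 1 and discount factor 1 as in the worked example
new_value = q_update(q_table, state=0, action=0, reward=1.0,
                     next_state=3, alpha=1.0, gamma=1.0)
print(new_value)  # 1.0
```

With alpha equal to one the old value drops out entirely, so the entry simply becomes the reward plus the discounted best value of the next state. A smaller alpha would keep part of the old value, which is how history is retained.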
here, right? And the old value of your present entry is zero. So the latest value in Q of state zero, action zero will be one. And when you go for the next iteration, okay, when you are again in state zero and take action zero, your result will be different. So this will be repeated. Positive actions, that is, actions leading to positive rewards, are getting accumulated. Negative rewards are getting accumulated too, but on the negative side. Whatever it may be, say there are many sensors detecting what is going on; since the state is that, how does it detect the reward? The reward, we have to define the reward. The reward is defined by us. How does that work? Is it subjective? So that is, again, one of the issues. What is the best way to define a reward for my problem? There is no definition like that. There is no designated way of saying, okay, this is how we have to define it for this problem. Say for example, let me say there are standard problems where you can see a quantitative reward. When I buy a stock, what is the value, what is the return I'm getting by the end of the day? That is my reward for the action. Sure, there is no such formula for getting to your destination. Moving forward in the direction of the destination keeps you oriented, so you actually have to keep the movement going. Okay, say for example taking a deviation: taking a deviation may be a trap. Okay, again we are going to the first example, only because that latest one is in our mind. Whatever happened, they actually put a trap. There were a lot of flights, fighter jets; that was the news, right? There were a lot of fighter jets around Bhopal and other places. So it was a big trap. So what is our loss? We actually spent a good amount of resources and fuel. But with the target, the
long-term target, in view, we actually lost a little bit at the start. So now we have to actually define what your reward is. My reward is reaching the destination and hitting the enemy, and that may have, say, $1 million as a reward. So can learning happen in two stages here, one online and one when the system is actually being trained? No, no, actually that doesn't happen. The online part, whatever you're saying, is actually part of training. The online part you're referring to is actually training. That's why I say it takes time to build the system, and the better you actually train it, the more robust it is. Because here we have, say, unsupervised learning as part of it, and with unsupervised learning as part of it, there is a lot of ambiguity. Okay. And there is a lot left for you to define, right? So now, there is a good amount of literature around this particular algorithm. Now, some of you were asking about these: they are hyperparameters. Each could be any floating-point number. Okay. And whatever she was asking about, keeping track of the learning in the computation: actually this is getting accumulated. Okay, at the end of the action, look at the equation. This actually comes out to be similar to this. This is what the update mechanism of the Q-table is. Okay, now there are a few parameters: the learning rate and the discount factor. Okay, when you actually look at the equation, there is the max, the maximum over the next state, okay, over all the possible actions: you're becoming greedy. Over all the actions in the next state. Consider: you performed action A_t in state S_t, and you land up in the next state, S_{t+1}. What is the best action in that one? You're taking the maximum of that, and then you're giving
a discount factor. You don't want to be too greedy, but you also want to follow that long-term target, right? And then old value versus new value: how much do you want to basically take up? Right, old value: this is the old value, and this is the learning rate. You may take alpha equal to one. When you take alpha equal to one, this value and this value get cancelled, right? And you are going ahead with the latest term, R(t+1) plus the discounted maximum; that means your present Q-values are replaced with R(t+1) plus that term. So you are becoming more and more short-sighted, greedy. We are not keeping track of history to a greater extent, right? We may not want that. So you want a little bit of learning happening, with history kept with you. So with alpha close to one, you're actually saying history doesn't matter to me; with alpha close to zero, history matters to me a lot, more than the present. Discount factor: how greedy do you want to become? Say, looking at the next moment, I want to actually win, I never want to lose even ₹1; then your discount factor is one. Training this model and getting the optimal numbers is going to be a nightmare when it comes to computation, correct? It is not so straightforward. It takes a good amount of computing power. Say for example one reinforcement learning convergence, I mean one model's convergence, takes, say, one week; for 100 parameter combinations, we have to take a cloud subscription and then, in an independent manner, train all those 100 reinforcement learning models, see how your system is performing, and then actually decide the learning parameters. We had something called grid search, right, in machine learning? Okay. So a similar approach we have to follow here as well, but there is no straightforward library available. There is no library; we have to write our own wrapper. Okay. And then the main steps in the algorithm: there are nine main steps. Okay. After the second step, perform
steps 3 to 9; that is a for loop. And once we come to the fourth step, perform 5 to 8; it's again a for loop. Start with: initialise the Q-table for all states and actions to zero. Okay. Then, for all episodes. In the sense, say, episode one: in that episode one I try to walk through the set of all possible states and actions again and again, and I have my Q-table prepared. Now I take that prepared Q-table as my starting state, and again I start the next episode. In a given episode, say I set my time steps to, at least, 1,000; I will be revisiting different states 1,000 times. Okay. And then the next episode, again 1,000 times. Like that I'll be repeating, and then finally my Q-table will be getting updated. This is the learning process of Q-learning. And when you actually look at this Q-learning, it is so simple. Actually, the hard work is hidden here: the preparation of states and actions. The state-action preparation is hidden; that's where actually all our blood will be squeezed. Okay, the state preparation. Q-learning is not everything; there are different variants of Q-learning also. Why people actually don't venture into reinforcement learning straight away is that there is no framework available like scikit-learn. Apart from toy problems, whatever problems we work on carry little value in the industry unless a business case is solved, right? So many people don't venture into reinforcement learning. There are sparse groups in India. One is the IISc computer science department; a couple of guys are working there. At IIT-B, a couple of guys are working there. Okay. Then there are some theoretical people, but not at the implementation level. And that's it, as far as academic institutes in India are concerned, for people focusing on these areas. And then there are a lot of people who actually associate themselves with RL and say, we are data scientists. They are not practitioners, they're not theoretical people; they are
on neither side. Without working with the data, implementation of RL doesn't make sense, right? So you're not on that side, I mean not practising; and without getting into, say, the Markov decision process and deriving the algorithms, the theoretical side, you're not there either. That's what I mean. Now let us start with a simple example: the smart taxi. This is a celebrated example, like our iris data set; in reinforcement learning you can call it the smart taxi. And the next one is walking on a frozen lake, saving yourself when you're walking on a frozen lake. On a frozen lake the ice may be thin or, say, semi-solid if you step onto it. Now, how do you survive walking on a lake like that? That is the second problem we will be solving. So here, the smart taxi. So here, what it has to do is this: in the game, the cab is the agent. The agent has to pick up a passenger from a pick-up location, and the passenger should be dropped at a designated location. Now we have to define a reward for a drop-off; we have to define a reward for every other action. We have to define them; the reward is in your hands. Fine, right? The task is: we have to pick up a passenger and drop him, and we basically want to teach the taxi to do this. The next one is: take care of passenger safety. Don't hit any divider or, what do you say, anything else. Okay. This is the problem. Now, what some of you were asking, like when an unknown situation is coming, how do we handle it: as of now we have to actually fall back on your old friend, k-means, state aggregation. That's it. So the rewards: for every proper pick-up, we have to reward with some big reward; for every proper drop-off, reward with a big reward. For a taxi, time is important. Okay? To pick up and drop, say, from here to Koramangala via Majestic: okay, it's also a perfect drop, but they are losing out on time; it should be penalised. So for every other step, if they are not reaching the target, there
should be some time penalty. That is also a reward, though it may be very marginal. If that is happening for a period of time, let's say you are actually getting stuck, your taxi is getting penalised, so it doesn't want to do that activity again and again. I mean the same repetition: it may actually stop at one place and sleep, and that is penalised, right? So since it is penalised, it doesn't want to do that activity again. But for all those penalties, now we have to define the seriousness of the penalty. Hitting the dividers is a serious penalty. Okay. As of now, in the simulation environment, penalties are equal to minus one by default. Whenever there is a penalty, it is actually added: the reward is negative one. When a proper passenger is picked up or dropped there is a reward, and we can actually redefine those things when we are coding. By default these values are there. So now we are getting the meaning of the reward, right? So: for a proper pick-up, a big reward; for a proper drop-off, a big reward; roaming around freely, a penalty; after picking up, if we're not reaching the destination in a shorter time, for every move we are penalising. Now, trying to visualise how the solution looks. Now we have to see the environment and then discretise it. Say for example you are in a parking lot. You have to visualise your system. Okay, this is, say, the rectangle I have in front of me. Let me discretise this rectangle into a five-by-five grid. The taxi can be in any of these five-by-five squares. Okay. The drop and pick-up locations could be anywhere in these. Okay, there are four locations. Each could be a drop; it could be a pick-up as well. Now basically, if this is the pick-up location, this guy has to directly come here, pick up, and drop here; any unnecessary roaming here will be penalised. And in the middle, whatever we have, these are obstacles. And if the cab is in the present location, what can it actually do? It can actually go north, east, south. It cannot go west. So north, south, west,
east, or let's say up, down, left, right. So if it actually takes this side, it is a serious negative. The reward you can redefine. And when you are actually penalising these sorts of wrong activities, these activities will not be repeated in future, with high probability. So this is what the environment is. Now I have taken the picture; I have put a 5-by-5 grid on top of it. Now, how many states are possible in this one? 25 states, and there are four locations to be picked up from. Okay, 25 here, and this five, and this four. If a person is in the cab, the cab can also go to any of these locations, right? Now, regarding obstacles, right? Of course I can actually increase the complexity. The obstacle is not in the middle of the road; that's what I'm actually saying. Here the obstacle is beside the road, not in the middle of the road. But if there are multiple lanes, okay, and there is a divider, left turns and right turns are possible. Okay, and say some animal is walking: that is an obstacle, and it can be in the middle of the road. A 5-by-5 grid, and four drop or pick-up locations; that means we have 25 into 4 in total, right? And then once the person is in the cab, the passenger is in the cab, he can be anywhere in this grid. He is picked up here. Now the cab location also is important, right? The cab is at (0,0), (0,1), (0,2), (0,3) and so on; the cab location you need to keep track of. And it is actually moving in one dimension. In one dimension we have, say, five grid cells. You already have one, the x axis: the cab's x axis can have five values. Let me say how it is. Say for example I'm at the zero-zero location; let me say the left-hand-side bottom is (0,0). Fine. And then the person, say, whether the passenger is there or not, whether the pick-up location is
there or not: zero or one. Let me say it is discrete. And then here it could be the cab: the cab is there or not. These are the four locations. The passenger can get into the cab at these locations and get out of the cab at these locations. This is one, this is one, this is one and this is one. In my next one it will be clear. So now, say for example, if it is here, I am here, the cab can be here or here. So when I am here, I'm looking at the cab location: when I am here, what is the cab location? Okay, for a given x value, what are the y values of the cab location? For these four locations, whether it is there or not, the choices are zero and one. Okay. The choices are: the cab can be there or not, the person is there or not. Say I am here on the x axis. Okay, now I am actually looking at the y axis: where could the cab be? The cab could be at (0,0); the cab could be at (0,1). Your x axis is fixed; you are fixing the x axis. Okay, they're independent. I am saying, when I am in this place, where is the cab? Okay, whether it is at (0,0), (0,1), (0,2), (0,3) or (0,4): five values are possible. Next one. Okay, if I'm not fixing the x axis, for the cab 25 values are possible; it can be in any one of these 25 locations. Okay, now say for example if I fix the x axis at zero, I have only five y values. Okay? So that's why I am taking five. So in fact I can actually go a little more complex. I can actually define a lot more things. Okay, this is the simplest grid I can define; this is the most simplistic way I can define it. From here onwards I can actually say: these dimensions are x and y, this is the passenger location, okay, and this could be the taxi location, right? Four dimensions. Agree with me? Now I can actually, say, to make it more spicy, say that the cab can be in any of these: five independent variables, five independent degrees,
let's say 25 degrees. Okay, I'm actually increasing the complexity. Okay, let me say now I have only this many states. How many states? That will come. Simple problem: 500 states. And in fact here we actually reduced the complexity. If you actually go for the complex one and say, yes, I basically don't want to fix this location also, I want to basically get all the coordinates without any aggregation, then it's not 500, it's 2,500 states. Actually, you can define more actions, because the obstacle is not in the middle of the road; it is actually at the side. Yes, we can actually define a lot more actions, but in that case we need to change the whole ecosystem. Say for example, in the simulation mode, the yellow one is the cab, jumping all over the place. Whenever it actually comes to the pick-up or drop location, it gets the reward, which you can redefine. Otherwise the reward is, say, minus one. If you are not at one of these four locations, the reward is minus one. If it is at one of these four locations, let's say there is a 5-by-5 grid: yes, the passenger is at location one; yes, the cab is at another location. So we are looking at the possible five-by-five. Makes sense? The cab is at the place where the passenger is standing, or it can be in any of the other 24 positions. Okay, so the total thing is this five by five. I want to print this particular scenario: where the cab is and where the passenger is. I'll come to that. The cab is, let me say, at index (1,1): this is one on the x axis and one on the other, whether the convention is coming from bottom to top or top to bottom. I have (1,1), and I say this is the location. Okay, once I have this location, the cab location is okay. I also have to define the passenger location. If the passenger is inside the cab, if the passenger is inside the cab, okay, then how do I actually give the notation:
Zero or one: whether he is inside the cab or he is not inside the cab, right? Then the destination. Now, the destination could be any of the 24. Take this case: the cab is at (1, 1), and the passenger is also there, at the same (1, 1). (1, 1) and (1, 1) indicates that the passenger is in the cab. Okay, and now the cab can go to any of the places. Instead of all 24, as you said before, only four locations are given; the destinations and starting positions are limited. So then we can indicate that the destination is any one of those four locations. That means you have 5 by 5, this is the grid, and these are the four locations. Okay, now, once the passenger is inside the cab, this is the destination: four locations. Once the passenger is inside the cab, you have fixed one dimension. He is inside the cab; he has to go from here to here, or from here to here, right? So you are fixing one dimension, your X axis. It is something like when you have x1 and x2 and you are writing a hyperplane. Let me say x1 equals x2 plus c. If I have fixed x1, what is x2 for the given x1? That is the hyperplane. Once he has picked up the passenger, wherever he is going, from this side to this side, only five positions are possible. You are fixing one. For example, if I pick up the passenger here, the cab can go this side, five locations, and this side, five locations: 25 positions are possible. You said five, but that gives you 24 positions as possible. He can go left, right, up, down; those things we are getting from here. What we have to see is: what is the maximum number of states possible, and what is the minimum number of states possible? If you look at this hyperplane, x1 and x2, it is two dimensions, five by five. Now we are saying x1 equals x2 plus c. That is a hyperplane. Okay, what is the hyperplane's dimension? It
is one. We started with two dimensions, x1 and x2. A hyperplane in any dimension is n minus one: if the dimension of your space is n, the hyperplane will be n minus one. Now, why it is coming out to be five is this: if you picked up here, the maximum number of possibilities is that the cab can move this side or this side. The picked-up location is one, and the rest are four possibilities, four places: one, two, three, four. Of course, if it hits an obstacle, it has to come back this way. So, coming this side, the Y values possible for this one are zero to four. Once I fix X, that is exactly what happens in a hyperplane. Yes, I can worry about all 25; that is like an extra, duplicate feature I can add, and in that case I end up with 2,500 states. Okay, I don't need to have those 2,500 states; with just 500 states I can replicate it. That's what I mean. Okay, if I have these 500 states, what will happen? This is what the simulation looks like: the cab is moving in all directions, all possible directions, and respectively it is accumulating either reward or penalty; I mean, a negative reward is a penalty. So, whenever we are moving from one location to another, this is what your Q-table is getting updated with. You start with a Q-table of zeros, right? And then, for every episode, for a designated number of time steps, you start updating your table with the formula we started with: Q of (state, action) equals Q of (state, action) plus your learning rate, with all that masala in it. That is how this particular use case works. Now let us go to the simulation and see how exactly it works. Can we please open this notebook? And by the way, if you want to implement this one on your own, without any framework, it will consume time, but it is not too difficult. The task is to pick up and drop passengers at the right locations. Okay, so we are clearly saying, like, this is pick-up and
this is our drop. Okay, what we are doing in total is rewarding whenever the cab touches base with one of these data points, I mean, one of these locations. In fact, this is not a full-fledged implementation, I should say: once you have picked up, only drop is possible, and we would have to get all that logic incorporated. Now, you can install the library using pip: pip install gym. This part I have already done, and it works. With Python 3.5 it works very well. On Python 2 also it works, but I hardly use Python 2. Import gym. Now I am loading this Taxi environment. When I load this Taxi environment, it gives me the description of the states and actions: what all the states are, what all the actions are, and what all the rewards are. That is the information I am getting. For some of the problems, it is collaborative: people started uploading environments. Even the humanoid robot walking environment people have uploaded to Gym. And then I am resetting it to a particular, let's say, random state: because the cab can be at any of the 25 locations, I am resetting it to a particular location. By the way, when you look at the states, they are not printed with coordinates; they are given numbers: 0, 1, 2, up to 499. Okay, so the state we are getting is 201, or rather, that was the earlier one. Now the cab is at the third location. Number of elements in the observation space, the number of states: 500. And now, if you see the state, this state three is what is coming out here; state three is rendered like this. I was one step ahead; I was reading the previous one. For this, the state is three. Anyway, we already rendered this. The possible actions are: zero, down; one, up; two, right; three, left; four, pick up; five, drop. Zero, one, two, up to five: in total six actions are possible. So now, if you see, the number of actions possible is six and the number of states is 500. Let me take, let's say, let
me set the state to 114. Okay, then the state would look like this: the cab is in this location. Now, when I take a step, what is the next state? Given my present state, what is the next state for an action? Let us say my present state is three: I started with the present state equal to three, and I am performing an action from zero, one, two, up to five. I am performing action one. Then what comes back is the resulting state, the reward, and whether we have reached the destination: no, we have not reached it. And then the other things, like the probability of 1.0, are there for debugging; as of now they don't carry much information. You can start adding your own debugging information there. For example, if I say action equals five, or action equals four: minus 100, oh sorry, minus ten. So we are seriously penalising; we probably hit the wall with this action, right? Action equals four was to pick up, and that was not possible here, so you are basically missing something there. Now, how good is behaving randomly? For example, I don't get into Q-learning at present. I have the state, I am putting in a counter, and I have a reward; I am initialising it. I start at a random location. Let me allow some more steps, 1,000 steps, and see whether I will be able to reach the destination, and what the reward would be to reach the destination: pick up and drop. In how many time steps (I am setting it to, let's say, a maximum of 1,000) will I be able to pick up the passenger and drop him, and what is my reward? As for the rewards, if you want to redefine them, you would be setting those yourself. When we loaded the Taxi environment, all these things were preloaded; otherwise we would have to write them on our own. So what I am saying is: unless the reward equals 20, keep on going, until I reach the final destination. Okay, in how many steps? I solved this one in 562 steps, using a while loop, so there is no limit on the steps. But in general, for the next one, the one I am doing now, it took 2,126 steps to pick up a
passenger and drop him. Okay, and now let us get into the Q-learning way of doing things. Whenever you are not dropping the passenger, the reward is not 20. When you are picking up the passenger, it is taking a reward. When you are hitting the wall, the reward is minus ten. So when the reward equals 20, that means you are reaching the destination. I shall check those three quickly. The condition is "reward equals 20": I am asking whether he has dropped the passenger properly or not. I am setting the Q-table to zeros. And then the number of episodes: let me run only once. I am keeping the accumulated reward G equal to zero, and the learning parameter alpha equal to 0.6-something, some number to start with. And you see here, my gamma, the discount parameter, I am taking as one; in my equation there is only alpha. Otherwise, this is the same equation we have been looking at, right? There is no deviation. So, for each episode, I am saying: while the reward is not equal to 20, take the action. This is the state; take the best action from the table. And then you call env.step(action) with this action, and for that action you get the reward. You are repeating it until your reward is 20. Again it is a while loop; you can put a counter of 1,000 time steps. Okay, so by this run, the state is 187, and the final state we print is 490, or 475. What is the total reward? The total reward is minus five. It is very bad; that means it took a lot of time. I trained the system only once, episode equals one. I did not train multiple times. I trained only once: episode equals one, once the passenger was picked up and dropped, that's it. So what I am now going to do is set the number of episodes. Okay, now we started with a reward of minus 77, and finally the model is stable with, let's say, a steady reward value. Okay, now we have our Q-table prepared.
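The training loop just described (Q-table of zeros, best action from the table, update with a learning rate alpha and gamma set to one) can be sketched without any framework. This is a minimal sketch under stated assumptions, not the notebook from the session: the tiny environment below is a hypothetical, wall-free stand-in for the Gym Taxi environment, the four pick-up cells in `LOCS` are assumed, bumping the grid edge costs only the ordinary step reward of minus one, and `alpha=0.618` is just one value matching the "0.6-something" mentioned above.

```python
import random
from collections import defaultdict

GRID = 5
LOCS = [(0, 0), (0, 4), (4, 0), (4, 3)]   # assumed pick-up/drop cells as (row, col)
IN_CAB = 4                                 # passenger value meaning "inside the cab"
ACTIONS = range(6)   # 0 down, 1 up, 2 right, 3 left, 4 pick up, 5 drop


def reset(rng):
    """Random cab cell, random passenger location, and a different destination."""
    passenger = rng.randrange(4)
    dest = rng.choice([d for d in range(4) if d != passenger])
    return (rng.randrange(GRID), rng.randrange(GRID), passenger, dest)


def step(state, action):
    """Return (next_state, reward, done) for the simplified, wall-free grid."""
    row, col, passenger, dest = state
    reward, done = -1, False                       # every step costs -1
    if action == 0:
        row = min(row + 1, GRID - 1)
    elif action == 1:
        row = max(row - 1, 0)
    elif action == 2:
        col = min(col + 1, GRID - 1)
    elif action == 3:
        col = max(col - 1, 0)
    elif action == 4:
        if passenger != IN_CAB and LOCS[passenger] == (row, col):
            passenger = IN_CAB                     # legal pick-up
        else:
            reward = -10                           # illegal pick-up
    else:
        if passenger == IN_CAB and LOCS[dest] == (row, col):
            passenger, reward, done = dest, 20, True   # successful drop
        else:
            reward = -10                           # illegal drop
    return (row, col, passenger, dest), reward, done


def train(episodes, alpha=0.618, gamma=1.0, seed=0):
    """Tabular Q-learning that always takes the best action known so far."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(ACTIONS))  # Q-table of zeros
    totals = []
    for _ in range(episodes):
        state, done, total = reset(rng), False, 0
        while not done:
            values = q[state]
            action = max(ACTIONS, key=values.__getitem__)
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            values[action] += alpha * (target - values[action])
            state, total = nxt, total + reward
        totals.append(total)
    return q, totals


q_table, episode_rewards = train(episodes=500)
print(episode_rewards[0], episode_rewards[-1])     # first versus last episode reward
```

The loop stops on the `done` flag rather than on "reward equals 20"; in this sketch the two coincide, because only a successful drop ends an episode. Comparing the first and last entries of `episode_rewards` gives a rough picture of how much the table improved over training.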
Prepared. We actually ran 2,000 iterations. Now let us see whether we are able to pick up and drop. So what we are doing is: I am setting the state equal to env.reset(). Are we done? No. We are saying, while not done, that is, while done is not equal to True, the action equals the best action for the given state. And using this action, your environment gives you the next state, the respective reward, and whether we have reached the destination or not: done or not. Earlier we were comparing whether the reward equals 20; "reward equals 20" or "done equals True", both tell you whether you are done or not. So we reset the system, and now you see the cab is here. It is asked to move north. There it could not move; it could not go upward. In the next one, it moved and picked up: the action at that place was to pick up the passenger. After that, it is asked to move south, then to move west, west, then south, south, and then finally drop off. Okay, you are rendering the environment at every step. When we trained with only one episode, it did not perform well. Of course, we did not test it in that phase. But as soon as it did the full Q-learning, the cab is able to do its job, and it looks very, very simple. But if you see the complexity: defining the number of states; for each state we have to decide what the possible actions are; and based on that we have to define the rewards, right? Whether we want to go ahead with 500 states or with 2,500 states, you are basically defining the rewards based on the state: for some category of states the reward is minus one, and for some category of states the reward is the biggest. Okay, now let me add a little more spice to this one. We have this done flag here, right: done equals True or not equal to True. What I will try to do is: if done is not equal to True and the reward equals zero, or the reward equals minus
one, okay, I want to penalise it with a higher number. If the reward equals, let's say, minus one, I set the reward equal to minus 100. I am penalising a lot for not doing the activity correctly. I am rewriting this one. Okay, so let me build this Q-table again. I am initialising the Q-table to zero, and the episodes I am setting to 2,000. You see, it initially started doing badly, and over a period of time it started learning. Okay, now let us see what happens from the initial state. Now it is doing very well: it started here, this is the initial state, it picked up, and then immediately moved up, because I am seriously penalising it for not doing the activity. Now, for example, instead of 2,000 episodes, let me set episodes equal to 100, because when you increase the number of episodes it is computationally expensive. That means episodes equals 100. Let us see how good it will be. Now see, it won't solve it straight away. We were very happy earlier. Now, you see, it is here, and it is asked to go west. But it cannot go west, because there is a wall there. When you are here, you cannot actually go west; there is a wall. So it is standing there only, and you see, those actions are repeated, repeated: an infinite loop. No, it could not solve the problem. Even after, let's say, 100 episodes, it got into a loop; it could not solve it. Let me penalise less and see what happens. Or, in fact, the state: maybe I can change the state. With 100 episodes we are not able to solve the problem. It is still stuck there only. We are not able to converge when the number of episodes is small. No, it is not able to solve the problem; it is getting stuck at some place. Now let me set it, instead of 100, let's say, to 1,000, and see whether we will be able to do it. Yes, any of these four locations can be the starting one. It can visit any of these four locations, and that will be the pick-up; any of these four locations after that. Okay, once he picks up, it
has to go to one of the other locations for the drop. And where are those obstacles put in here? Sorry, I missed that part. In the code, where did we mention those? Okay, I understand your question; I got slightly deviated. The obstacles: where did we mention them? We are talking about states, right? When I am in a particular state, let's say west is not possible. Where did we mention those? In the environment. We loaded the environment; that is where they are mentioned. We don't mention them: the environment already mentions them and gives them to us. So this is like a pre-built environment loaded here. When you take that action, the state will stay there only, but you are getting penalised. As I said, we were hitting west, west, and standing there only. So when we said Taxi, we pulled that environment, with those already loaded in. Exactly. And now, as a data scientist, what is our value-add? Our value-add is creating these environments. Once we have an environment, I mean to say once you have the states properly defined, the rewards defined and the actions defined, solving is Q-learning. That's it. Very simple, right? But preparing that environment is problem-dependent: the environment I prepared for my problem, you may not be able to use. That is the challenge. We are limited to the data in the environment, it looks like, in this problem. Yes, we can go to OpenAI Gym; the complete source code is available. Yes, we loaded it from OpenAI Gym. For example, OpenAI Gym on GitHub: all the source code is available. It is completely open source. And now, if you want to define your own environment, what people are doing is adapting an existing environment: I take the Taxi environment and make modifications for my own data, because how states, actions and
rewards are to be combined is already defined in this one. So let me take that as a template, and I can do it for my problem. So if you certainly want to use a framework, this framework, you have to modify it that way. But the best way is to write your own: define your states, actions and rewards yourself; that way you will have better control. We hope you liked this complete tutorial on machine learning and Python. Great Learning offers high-quality, impactful and industry-relevant programmes to working professionals like you. Our faculty pool comprises leading academicians and industry practitioners in the field of data analytics. For more information, check the links in the description down below. Don't forget to like, share and subscribe. Remember, the only learning that matters is great learning. See you in the next one.

