Lecture Note
Is there a connection between a vehicle's price and its drive system type? Whether a vehicle has front-, rear-, or four-wheel drive is one of the most important things buyers take into account. But does a vehicle's price depend on its drive system? And if so, which kind of drive system adds the most to a car's value?

To answer these questions, we use Python's Pandas package to group the price data by drive system type and compare the groups.

Grouping Data with the groupby Method

Pandas is an effective tool for data analysis that lets us split data into categories. Using the "groupby" method, we can group our data by drive system type and compare the average prices.

Suppose our goal is to find the average price of vehicles and see how it varies with the type of drive system. We first select the three columns we are most interested in: price, drive system, and body style. We then group the reduced data by drive system and body style. Finally, to see how the average price varies across drive systems and body styles, we take the mean of each group.
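The select, group, and average steps can be sketched as below. This is a minimal illustration, not the lecture's original code: the sample rows are made up, and the column names "drive-wheels", "body-style", "price", and "horsepower" are assumptions.

```python
import pandas as pd

# Hypothetical sample data; values and column names are assumptions
# chosen for illustration, not the lecture's dataset.
df = pd.DataFrame({
    "drive-wheels": ["fwd", "fwd", "fwd", "rwd", "rwd", "4wd"],
    "body-style": ["sedan", "sedan", "hatchback", "convertible", "sedan", "hatchback"],
    "price": [9000.0, 11000.0, 8000.0, 24000.0, 20000.0, 7600.0],
    "horsepower": [70, 84, 68, 160, 121, 82],
})

# Step 1: keep only the three columns of interest.
df_subset = df[["drive-wheels", "body-style", "price"]]

# Step 2: group by drive system and body style.
# Step 3: take the mean price of each group.
df_grouped = df_subset.groupby(["drive-wheels", "body-style"], as_index=False).mean()

print(df_grouped)
```

Passing as_index=False keeps the two grouping variables as ordinary columns, which makes the result easier to reshape later.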
As a result, the data is now divided into subcategories, and only the average price of each subcategory is shown. Based on our data, rear-wheel-drive hardtops and convertibles have the highest average prices, while four-wheel-drive hatchbacks have the lowest.

Pivot Method

A grouped table is helpful, but it can be difficult to read and to visualize efficiently. Using the pivot method, we can turn it into a pivot table, which is easier to understand.

A pivot table displays one variable along the columns and the other along the rows. With a single call to the pivot method, we can rearrange the table so that body style appears along the columns and drive system appears along the rows.
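A pivot of this kind might look like the sketch below. The data and the column names "drive-wheels", "body-style", and "price" are assumptions made for illustration, not the lecture's dataset.

```python
import pandas as pd

# Hypothetical sample data; values and column names are assumptions.
df = pd.DataFrame({
    "drive-wheels": ["fwd", "fwd", "fwd", "rwd", "rwd", "4wd"],
    "body-style": ["sedan", "sedan", "hatchback", "convertible", "sedan", "hatchback"],
    "price": [9000.0, 11000.0, 8000.0, 24000.0, 20000.0, 7600.0],
})

# Average price for each (drive system, body style) pair.
df_grouped = df.groupby(["drive-wheels", "body-style"], as_index=False)["price"].mean()

# Drive systems go along the rows, body styles along the columns.
df_pivot = df_grouped.pivot(index="drive-wheels", columns="body-style", values="price")

print(df_pivot)
```

Combinations that never occur in the data, such as a front-wheel-drive convertible in this sample, appear as NaN in the pivot table.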
The price data is now arranged in a rectangular grid, which is much easier to read. This is similar to the layout typically used in Excel spreadsheets.

Heatmap

The pivot table can also be shown as a heat map. A heat map takes a rectangular grid of data and assigns each grid point a color intensity based on its value. It is an excellent way to plot the target variable against several other variables and get visual cues about how those variables relate to the target.

We plot the heat map using pyplot's pcolor function with the red-blue color scheme. In the output plot, each body style is represented on the x-axis and each drive system on the y-axis. The average prices are drawn in different hues depending on their values.

According to the color bar, the top section of the heat map appears to have higher prices than the bottom section. This suggests that rear-wheel-drive hardtops and convertibles are typically more expensive than other cars.

Conclusion

Python's Pandas package lets us group and compare price data across drive system types. Our data shows that rear-wheel-drive convertibles and hardtops have the highest average prices.
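The heat map described in the Heatmap section can be sketched as follows. This is an illustration under assumptions rather than the lecture's original code: the data, the column names, and the output file name are all hypothetical, and the sample includes every drive system and body style combination so the grid has no missing cells.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data; values and column names are assumptions.
df = pd.DataFrame({
    "drive-wheels": ["4wd", "4wd", "fwd", "fwd", "rwd", "rwd"],
    "body-style": ["hatchback", "sedan", "hatchback", "sedan", "hatchback", "sedan"],
    "price": [7600.0, 12000.0, 8000.0, 10000.0, 14000.0, 20000.0],
})

# Rows are drive systems, columns are body styles.
df_pivot = df.pivot_table(index="drive-wheels", columns="body-style", values="price", aggfunc="mean")

fig, ax = plt.subplots()
# pcolor draws one colored cell per grid value; "RdBu" is the red-blue colormap.
im = ax.pcolor(df_pivot.to_numpy(), cmap="RdBu")

# Put each tick at the center of its cell and label it.
ax.set_xticks(np.arange(df_pivot.shape[1]) + 0.5)
ax.set_xticklabels(df_pivot.columns)
ax.set_yticks(np.arange(df_pivot.shape[0]) + 0.5)
ax.set_yticklabels(df_pivot.index)
ax.set_xlabel("body style")
ax.set_ylabel("drive system")

fig.colorbar(im, ax=ax, label="average price")
fig.savefig("drive_body_heatmap.png")  # hypothetical output file name
```

Note that pcolor draws the first row of the grid at the bottom of the plot, so the last drive system in the pivot table ends up at the top.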
GroupBy in Python