The platform eases the work of data and business analysts by letting them draw inferences from data without having to write code. It will provide end-to-end data handling capabilities: the ETL (Extract, Transform, Load) pipelines needed to build the dataset, automated data profiling and cleaning, analysis of how the data changes over time, and improvement of data quality. As the next step, neural networks will be generated automatically to perform classification or regression on any target variable in the dataset. The platform will also create and suggest visualizations for the given dataset that help users understand the data at hand and drive decisions.
In the data visualisation module, the main challenge was selecting an approach that gave the best results when ranking all the plots. We came across two such approaches, “partial order” and “learning to rank”. We tested both on various datasets and concluded that the former, partial order, produced better results.
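To make the partial-order idea concrete, here is a minimal sketch in Python. It is not the platform's implementation: the three criteria per plot, the score values, and the names `dominates` and `rank_by_partial_order` are assumptions made for illustration. One plot is taken to dominate another when it scores at least as well on every criterion and strictly better on at least one; plots are then ordered by how many other plots they dominate.

```python
from typing import Dict, List, Tuple

# Hypothetical per-plot scores, one value per criterion (higher is better).
Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is >= b on every criterion and > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def rank_by_partial_order(plots: Dict[str, Scores]) -> List[str]:
    """Order plots by the number of other plots they dominate.

    Plots that are incomparable under the partial order can end up with the
    same count; ties are broken alphabetically so the output is deterministic.
    """
    wins = {
        name: sum(
            dominates(scores, other_scores)
            for other, other_scores in plots.items()
            if other != name
        )
        for name, scores in plots.items()
    }
    return sorted(plots, key=lambda name: (-wins[name], name))


# Illustrative candidates only; the scores are made up for this example.
candidates = {
    "bar_sales_by_region": (0.9, 0.8, 0.7),
    "line_sales_over_time": (0.8, 0.9, 0.6),
    "pie_sales_by_region": (0.5, 0.6, 0.4),
    "scatter_id_vs_sales": (0.2, 0.3, 0.1),
}

ranking = rank_by_partial_order(candidates)
print(ranking)
```

In this example the bar and line charts are incomparable (each beats the other on one criterion), so neither dominates the other and both rank above the pie and scatter charts. A learning-to-rank approach would instead train a model on labelled examples of good and bad plots, which requires training data that this sketch does not need.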
Technologies used
Discussion