MyGov is organising an Online Challenge for Developing a Predictive Model in GST. The deadline for submission is September 26, 2024.
About the Hackathon
The purpose of this Hackathon is to engage Indian students, researchers, and innovators in developing advanced, data-driven AI and ML solutions based on a given dataset. Participants will have access to a comprehensive dataset containing approximately 900,000 records, each with around 21 attributes and target variables. The data is anonymized and meticulously labeled, and it includes training and testing subsets as well as a non-validated subset reserved specifically for final evaluation by the GSTN.
Participants are encouraged to use this dataset to design and implement innovative artificial intelligence (AI) and machine learning (ML) algorithms to tackle the stated challenge.
Additionally, this initiative aims to foster collaboration between academia and industry professionals, driving the development of effective and insightful solutions that strengthen the GST analytics framework.

Eligibility
Indian students or researchers associated with educational institutions, or working professionals associated with Indian startups and companies, can participate in the Hackathon. Participants must be citizens of India.
Registration Procedure
All participants must register at Janparichay. A registered user can log in directly at this link and submit the required details to participate in the Hackathon. Participants are expected to submit accurate, up-to-date details and must confirm this before submission.
Steps for Login and Registration:
- Access the challenge page at this link.
- Click on "Login to Participate".
- The user is redirected to the "Janparichay" site. Participants can log in with their credentials in any of the following ways:
  - Username: participants can log in with username and password.
  - Mobile: participants can log in with mobile number and password.
  - Others: participants can log in with email id and password.
- After login, the user is redirected from Janparichay to the event site.
- New User Login -> Participants who are new to Janparichay have to register on Janparichay first.
  - A Janparichay account primarily takes a mobile number during the registration process.
  - Participants are advised to update their email id in their Janparichay account before proceeding to the event site. The steps for doing so are:
    - Step 1: After logging in to the Janparichay site, go to the edit profile page.
    - Step 2: Under "VERIFICATION DETAILS", select "Primary Email Id" in the "Select Verification Parameters" dropdown.
    - Step 3: Enter the email id in the text field and click on "Verify".
    - Step 4: Fill in the OTP sent to that email id and click on "Submit".
    - Step 5: Log out from this service and log in again via mobile number or email id to access the same.
- Old Janparichay User with no email id in Janparichay account -> These participants are advised to update their email id in their Janparichay account first. The steps for doing so are mentioned above.
Structure of the Hackathon
- The Hackathon would be organised as an online event with processes for registration of participants, accessing the datasets to be utilized for each problem statement, and submission of developed prototypes. There would be an offline event with the shortlisted participants for the finale/second round.
- Indian students or researchers associated with educational institutions, or working professionals associated with Indian startups and companies, can participate in the Hackathon. Participants must be citizens of India.
- The participants are expected to form teams of up to five members including at least one team lead. A participant may only register as a member of a single team.
- The Hackathon would take place over 45 days from the start of registration to the final date for submission of developed prototypes.
- Participants would receive a dataset containing 9 lakh (900,000) records with around 21 attributes each. The data is anonymized and labelled, and it includes training, testing, and non-validated subsets.
- Before submitting the solution prototype, participants have to upload their code to a Git repository and, optionally, a demo/product video to YouTube.
- For online submissions, the following required/optional fields are to be shared for evaluation:
  - Idea/Concept
  - Project Description
  - Source Code URL (github.com)
  - Video URL
  - GitHub Unique Source Code Checksum: steps to create the checksum are mentioned in later steps.
  - Project Report
- The evaluation process of the Hackathon would be overseen by a distinguished panel of jury members comprising experts from the fields of machine learning, data science, and tax administration. The jury would rigorously assess each submission based on predefined criteria to ensure a fair and comprehensive evaluation.
Problem Statement
Given a dataset $D$, which consists of:
$D_{\text{train}}$: a matrix of dimension $\mathbb{R}^{m \times n}$ representing the training data.
$D_{\text{test}}$: a matrix of dimension $\mathbb{R}^{m_1 \times n}$ representing the test data.
We have also provided the corresponding target variables: $Y_{\text{train}}$, a matrix of dimension $\mathbb{R}^{m \times 1}$, and $Y_{\text{test}}$, a matrix of dimension $\mathbb{R}^{m_1 \times 1}$.
The objective is to construct a predictive model $F_\theta(X) \rightarrow Y_{\text{pred}}$ that accurately estimates the target variable $Y_i$ for new, unseen inputs $X_i$.
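One common way to write this objective down is as empirical risk minimization followed by evaluation on the test set. The formalization below is a sketch, not the organisers' wording: the per-sample loss $\ell$, the averaging over the $m$ training rows, and the use of plain accuracy as the metric $M$ are all assumptions made for illustration.

$$
\theta^{*} = \arg\min_{\theta} \; \frac{1}{m} \sum_{i=1}^{m} \ell\bigl(Y_i, F_\theta(X_i)\bigr),
\qquad
Y_{\text{pred},j} = F_{\theta^{*}}(X_j), \quad j = 1, \ldots, m_1,
$$

$$
M_{\text{accuracy}} = \frac{1}{m_1} \sum_{j=1}^{m_1} \mathbf{1}\bigl[\, Y_{\text{pred},j} = Y_{\text{test},j} \,\bigr].
$$

Here $X_i$ denotes the $i$-th row of the feature matrix and $Y_i$ the matching target value. The parameters are fitted on $D_{\text{train}}$ only, and $D_{\text{test}}$ is used only to measure performance.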
Steps:
1. Model Construction:
Define a predictive function $F_\theta(X)$, parameterized by $\theta$, that maps input features $X$ to predicted outputs $Y_{\text{pred}}$.
The model $F_\theta(X)$ should be designed to capture the relationship between the input features and the target variable effectively.
2. Training:
Optimize the model parameters $\theta$ by minimizing a loss function $L(Y, F_\theta(X))$ using the training data $D_{\text{train}}$.
Consider incorporating feature transformations, feature engineering, or feature selection to enhance the model's predictive performance.
3. Testing:
Apply the learned model $F_{\theta^*}(X)$ (with optimized parameters $\theta^*$) to the test data $D_{\text{test}}$ to generate predictions $Y_{\text{pred}}$ for each input $X_j \in \{X_1, X_2, \ldots, X_{m_1}\}$.
4. Performance Optimization:
Evaluate the model's performance by calculating accuracy or other relevant metrics $M$ on the test predictions $Y_{\text{pred\_test}}$.
Refine the model by iteratively adjusting $\theta$ or modifying $F_\theta(X)$ to improve performance on the chosen evaluation metrics $M$.
5. Submission:
Present the predicted outputs $Y_{\text{pred\_test}}$ along with a detailed report that includes:
- The modeling approach employed (properly commented code, supporting citations, etc.).
- The metrics used for evaluation.
- Key performance indicators as per the defined metrics for the hackathon.
** Kindly refer to the "Submission and Expectation" page before submitting your solutions.
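The construct, train, test, and evaluate steps above can be sketched end to end in a few lines. The example below is a minimal illustration under loud assumptions: it treats the task as binary classification, uses logistic regression as a stand-in for the model, and generates synthetic data with a hypothetical `make_toy_data` helper because the hackathon dataset and its schema are not described here. None of the function names come from the organisers.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(theta, x):
    # F_theta(x): theta[0] is the bias, theta[1:] are the feature weights.
    z = theta[0] + sum(w * xi for w, xi in zip(theta[1:], x))
    return sigmoid(z)


def log_loss(theta, X, Y):
    # Mean binary cross-entropy, playing the role of L(Y, F_theta(X)).
    eps = 1e-12
    total = 0.0
    for x, y in zip(X, Y):
        p = min(max(predict_proba(theta, x), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(X)


def train(X, Y, lr=0.5, epochs=300):
    # Step 2: batch gradient descent on the log loss, starting from theta = 0.
    n = len(X[0])
    theta = [0.0] * (n + 1)
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for x, y in zip(X, Y):
            err = predict_proba(theta, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        theta = [t - lr * g / len(X) for t, g in zip(theta, grad)]
    return theta


def accuracy(theta, X, Y):
    # Step 4: fraction of rows whose thresholded prediction matches the label.
    preds = [1 if predict_proba(theta, x) >= 0.5 else 0 for x in X]
    return sum(p == y for p, y in zip(preds, Y)) / len(Y)


def make_toy_data(m, n, seed):
    # Hypothetical synthetic stand-in for the real data:
    # the label is 1 when the features sum to a positive number.
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
    Y = [1 if sum(x) > 0 else 0 for x in X]
    return X, Y


X_train, Y_train = make_toy_data(m=400, n=3, seed=0)
X_test, Y_test = make_toy_data(m=100, n=3, seed=1)

theta_star = train(X_train, Y_train)                    # optimized parameters
test_accuracy = accuracy(theta_star, X_test, Y_test)    # metric M on D_test
print(f"train loss: {log_loss(theta_star, X_train, Y_train):.3f}")
print(f"test accuracy: {test_accuracy:.3f}")
```

In practice a participant would replace the toy generator with the provided training and test files and would likely use an established library rather than hand-written gradient descent. The structure stays the same: fit on the training split only, then report metrics on the held-out split.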
Tech Stack for Building Algorithm
- Participants are encouraged to innovate by developing their unique functions (f(x)) to tackle the given challenge.
- Participants have the liberty to utilize any tech stack of their preference for model development. This flexibility allows them to harness the tools and technologies they are most adept at, facilitating the creation of effective and inventive solutions and deriving the mathematical function for this Hackathon.
- Participants are encouraged to explore and experiment with diverse ensemble techniques, blending different machine learning algorithms to enhance performance and attain optimal results on test data.
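As one illustration of the blending idea in the last point, the sketch below averages the class-1 probabilities produced by several models (often called soft voting) and thresholds the result. It assumes a binary target and that each model already outputs a probability per row. The function `soft_vote`, the three probability lists, and the weights are hypothetical values chosen for the example, not part of the hackathon material.

```python
def soft_vote(prob_lists, weights=None):
    """Blend per-model probabilities with a (weighted) average.

    prob_lists[k][i] is model k's predicted probability of class 1 for row i.
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    if len(weights) != len(prob_lists):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_rows = len(prob_lists[0])
    blended = []
    for i in range(n_rows):
        score = sum(w * probs[i] for w, probs in zip(weights, prob_lists))
        blended.append(score / total)
    return blended


# Hypothetical outputs of three different models on the same four rows.
model_a = [0.9, 0.4, 0.2, 0.6]
model_b = [0.8, 0.6, 0.1, 0.3]
model_c = [0.7, 0.7, 0.4, 0.45]

blended = soft_vote([model_a, model_b, model_c])
labels = [1 if p >= 0.5 else 0 for p in blended]

# Giving the first model twice the weight of the others.
weighted = soft_vote([model_a, model_b, model_c], weights=[2, 1, 1])
```

Averaging tends to help when the individual models make different kinds of errors. Any weights should be chosen on a validation split rather than on the test data, so that the reported test metrics remain an honest estimate.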
Prizes
The Hackathon offers significant prizes for the top-performing teams, and these are:
- First Prize: Rs. 25 lakhs
- Second Prize: Rs. 12 lakhs
- Third Prize: Rs. 7 lakhs
- Special Prize of Rs. 5 lakhs for All-Women Teams (in addition to the top three prizes)
- Prizes would only be awarded if the jury is satisfied that the designed solution is usable as a viable product.
- Consolation prizes of Rs. 3 lakh, Rs. 2 lakh, Rs. 1.5 lakh and Rs. 1 lakh would be given in lieu of the announced prizes if the jury finds that no model provides a perfect solution to the problem statement.
How to Register?
Interested participants can register through this link.
Deadline
The deadline for submission is September 26, 2024.





