Engineering Challenges
Welcome!
You’ve landed here because we’re considering your profile for our team.
This page is designed to give you a better idea of what to expect from working at Veridion and of the kinds of problems we invest our time in. The challenges we prepared let you showcase how you think and how you approach real-world problems, and they give you a glimpse of the tasks you will face in your role here.
The way we work
The thought process behind our problem-solving approach

Solve hard, useful, and unresolved problems
The tech we use enables novel solutions, and as such, we tackle the hard stuff that others won’t, and focus on problems that haven’t been solved yet.

Creativity over convention
We don’t follow templates. Most problems we face have no prior solution, so we have to come up with our own ways of getting things done.

Prioritize solving the problem
We look for the most effective way to solve the problem rather than defaulting to a specific tool or framework. We employ whatever technique is necessary, without imposing artificial limitations.

Deliver real-world value
We build solutions that genuinely work and provide actual value, effectively balancing development time, accuracy, and speed.

Exponential growth
We scale by continuously learning, iterating, and pushing boundaries for both our product and our team.
What makes a great solution
Here’s what we expect from a remarkable project:
The way you deliver this project reflects your work ethic and how you tackle problems from start to finish. So, please ensure your project is as “production-ready” as possible. If you’re not ready to give it your best, it’s probably not worth doing at all. It’s pointless to submit 40 lines of code and call it a solution to one of the challenges below.
The goal is to show that you can make an impact and approach problems with the right mindset, just as you would as part of our team. While we don’t expect you to have all the answers right now, we do expect you to demonstrate a strong work ethic, the potential for growth, and the ability to tackle complex problems effectively. Because if you don’t prove that, we can’t afford to take you seriously as a potential team member. If hired, your effort during the recruiting process will be rewarded with a bonus: this is not about free work.

Correctness
Does your solution meet all the requirements and constraints given in the challenge?

Robustness
How well does your solution handle unexpected input or edge cases?

Code quality
Is the code well-organized, readable, and maintainable?

Extra mile
Does your solution only address the bare minimum or does it go beyond the surface?

Presentation
How well does the presentation reflect your reasoning, why you made certain decisions and the thought process behind your solution?
The builder mindset
What if we told you there’s a straightforward way to significantly increase your chances of securing a role at Veridion? You might be skeptical, but this is it:

Take your time to understand the problem, but don’t expect to have endless time for perfection. All of these challenges are solved with code, but the more important component is a real understanding of the problem, which comes from understanding the data you’re working with.
Candidates who succeed in solving these challenges are those who focus on building solutions that add real value, rather than just writing abstract code disconnected from the problem they’re solving.

If you expect to succeed without prior or current effort, you may need to reconsider your approach. If you can solve any of these problems well, it’s because you either:
- have invested effort in the past, building your skills to tackle unknowns and difficult challenges.
- are willing to invest the effort now, working diligently to raise your skill level.

This is a complex task: it takes an experienced candidate at least a few hours, and someone less senior will need more. If it takes you only two hours, it’s probably because you haven’t invested enough time in investigating and understanding the problem you’re trying to solve.
Don’t let the complexity discourage you. The requirements are designed to test your ability to deeply understand the problem and conduct a bit of research before jumping into writing code.

Don’t limit yourself to methods others have used to solve similar problems in the past. There’s a good chance you can find a better solution if you allow yourself to be creative and invest a bit more time to study the problem from multiple perspectives.

Anticipate that it will be hard, uncertain, and occasionally filled with confusing or frustrating moments, which is what builds resilience and makes a difference in the long run.

We don't like wasting (anyone's) time, so these are not random tasks designed to test whether you can write some out-of-context code just so we can see some syntax before the interview.
Consider why any of these projects would be relevant to the recruitment process for this role and how they relate to what we do at Veridion.

No matter which challenge you choose, you can use any programming language or toolset you're comfortable with. We learn to master new tools every day, and we think the reasoning behind your decisions is what counts the most. You can probably get up to speed with the rest rather quickly.
Choose your track
Check out the challenges below and choose the one that you think you’ll do your best at. They’re very similar to the type of tasks you’ll work on once you join our team.
Can’t wait to see what you come up with. Enjoy the process. We’re confident you’ll kick ass!
Task
Match and group websites by the similarity of their logos.
Context
Logos are instrumental for a company’s identity – they’re the symbol that customers use to recognize your brand. Ideally, you’ll want people to instantly connect the sight of your logo with the memory of what your company does and, more importantly, how it makes them feel.
Guidelines
- Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
- We know this is a clustering problem, you know this is a clustering problem, question is: can you do it without ML algorithms (like DBSCAN or k-means clustering)?
- Check whether the program correctly extracts the logos and matches them properly (as a human, you instantly recognize them, but this is way harder for a machine).
- Explore this from as many different angles as you can. It will generate valuable questions.
- From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Node, Python, Scala).
- At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem: if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
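As one illustration of grouping without DBSCAN or k-means, the sketch below links websites with union-find whenever two logo fingerprints are close. It is a minimal sketch under stated assumptions, not a reference solution: it assumes each logo has already been reduced to an integer fingerprint (for example a perceptual hash), and the names `hamming`, `group_by_similarity` and `max_distance` are hypothetical.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def group_by_similarity(fingerprints: dict[str, int], max_distance: int = 5) -> list[list[str]]:
    """Group sites with union-find: link any pair whose fingerprints are close.

    `fingerprints` maps a website to a hypothetical integer logo fingerprint.
    """
    parent = {site: site for site in fingerprints}

    def find(site: str) -> str:
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    # Compare every pair; this is quadratic, which is fine for a small dataset.
    for a, b in combinations(fingerprints, 2):
        if hamming(fingerprints[a], fingerprints[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for site in fingerprints:
        groups.setdefault(find(site), []).append(site)
    return sorted(sorted(group) for group in groups.values())
```

Linking is transitive here, so a chain of near matches can pull visually different logos into one group; the threshold and the fingerprint itself are the parts that need real investigation against the data.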
Resources
If you’re ready to jump into the problem, please start with the following list of company websites:
Expected Deliverables
- Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here, so feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
- Output
Your algorithm should be able to extract logos for more than 97% of the websites in the given dataset.
Your program should output multiple groups, each containing one or more websites (some logos may be unique to a single website). Make sure to upload your results along with the code.
- Code and Logic
Include the code that enabled you to achieve this task for the provided list, and ideally, for any list of any size.
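The 97% extraction target and the grouped output can be checked mechanically. The following is a small sketch under assumptions: the function names, the `logos` mapping (website to raw logo bytes) and the JSON layout are hypothetical, since the brief does not prescribe an output format.

```python
import json


def extraction_coverage(sites: list[str], logos: dict[str, bytes]) -> float:
    """Fraction of input sites for which a non-empty logo was extracted."""
    if not sites:
        return 0.0
    found = sum(1 for site in sites if logos.get(site))
    return found / len(sites)


def meets_target(coverage: float, target: float = 0.97) -> bool:
    """The brief asks for logos on more than 97% of the websites."""
    return coverage > target


def dump_groups(groups: list[list[str]]) -> str:
    """Serialize groups as JSON; the layout is an assumption, not a requirement."""
    return json.dumps({"groups": groups}, indent=2)
```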
Submit your project
When you’re finished with the challenge, please submit the link to your GitHub project below.
Task
Build a robust company classifier for a new insurance taxonomy.
Objectives
- Accept a list of companies with associated data:
– Company Description
– Business Tags
– Sector, Category, Niche Classification
- Receive a static taxonomy (a list of labels) relevant to the insurance industry.
- Build a solution that accurately classifies these companies, and any similar ones, into one or more labels from the static taxonomy.
- Present your results and demonstrate effectiveness.
Guidelines
Since this is an unsolved problem without a predefined ground truth, you’ll need to validate your classifier’s performance through your own methods.
- Analyze strengths and weaknesses:
- Explain where your solution excels and where it may need improvement.
- Discuss scalability and how your solution performs with large datasets.
- Reflect on any assumptions made and unknown factors that could impact your solution.
- Ensure your solution truly addresses the problem
- Focus on solving the actual problem, not just implementing complex algorithms. Using embeddings, zero-shot models, TF-IDF, clustering, or other techniques is meaningless if companies are misclassified due to a flawed approach. A well-designed solution is more important than an impressive algorithm.
- Your evaluation should demonstrate that your solution effectively addresses the problem. Simply plotting similarity scores or reporting F1 and accuracy metrics without meaningful validation only measures alignment with your own heuristic, not real-world effectiveness.
- Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
- Provide insights into your problem-solving process:
- Why you did what you did, what other paths you considered, and especially why you chose not to pursue them.
- At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem: if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
Resources
If you’re ready to jump into the problem, please start with the following files:
- List of Companies: company_list
- Insurance Taxonomy: insurance_taxonomy
Expected Deliverables
Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here, so feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Annotated Input List
Return the input list with a new column titled “insurance_label” where you have correctly classified each company into one or more labels from the insurance taxonomy.
Code and Logic
Include the code that enabled you to achieve this classification for the provided list, and ideally, for any list of any size.
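To make the shape of the annotated output concrete, here is a deliberately naive baseline that appends an `insurance_label` column to a CSV. It is only a sketch: the word-overlap rule, the `min_overlap` threshold, the `; ` separator for multiple labels and the column names used in the usage note are assumptions, and a rule this simple is exactly the kind of heuristic whose results need independent validation.

```python
import csv
import io
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphabetic words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def label_company(row: dict, taxonomy: list[str], min_overlap: float = 0.5) -> list[str]:
    """Return every label whose words mostly appear in the company's text fields."""
    company_words = tokens(" ".join(v for v in row.values() if isinstance(v, str)))
    matches = []
    for label in taxonomy:
        label_words = tokens(label)
        if label_words and len(label_words & company_words) / len(label_words) >= min_overlap:
            matches.append(label)
    return matches


def annotate(csv_text: str, taxonomy: list[str]) -> str:
    """Copy the input CSV and append an insurance_label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    fieldnames = [*(reader.fieldnames or []), "insurance_label"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Multiple labels are joined with "; " (an assumed convention).
        row["insurance_label"] = "; ".join(label_company(row, taxonomy))
        writer.writerow(row)
    return out.getvalue()
```

For example, calling `annotate` on a file with hypothetical `description` and `business_tags` columns returns the same rows with one extra column, left empty when no label clears the threshold.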
Submit your project
When you’re finished with the challenge, please submit the link to your GitHub project below.
Task
Identify unique companies and group duplicate records accordingly.
Context
The dataset contains company records imported from multiple systems, leading to duplicate entries with slight variations.
Guidelines
- Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
- The dataset includes extensive company details, but not all fields are necessary for deduplication.
- The key challenge is to identify and leverage the most relevant attributes to accurately detect and group duplicate records.
- Take the time to research and understand what defines a company and which attributes uniquely identify it. This understanding is crucial for accurately detecting and grouping duplicate records.
- At times, incomplete data may require you to make decisions where there is no clear right or wrong choice. What matters is backing each decision with the reasoning behind it.
- It’s essential to document your decisions and the reasoning behind them.
- From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Scala, Java, Python).
- At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem: if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
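One common starting point for this kind of grouping is to normalize a few identifying attributes into a key and bucket records by it. The sketch below assumes hypothetical `name` and `country` fields and a small, incomplete list of legal suffixes; which attributes actually identify a company in the dataset is the part the guidelines ask you to research.

```python
import re
from collections import defaultdict

# A small, incomplete set of legal-form tokens (an assumption for illustration).
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "gmbh", "corp", "co", "srl", "sa"}


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing legal-form tokens."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def group_duplicates(records: list[dict]) -> list[list[int]]:
    """Return groups of record indices that share a normalized (name, country) key."""
    buckets: dict[tuple[str, str], list[int]] = defaultdict(list)
    for index, record in enumerate(records):
        name = normalize_name(record.get("name") or "")
        country = (record.get("country") or "").strip().lower()
        # Records without a usable name cannot be matched safely, so each stays alone.
        key = (name, country) if name else ("", f"unmatched-{index}")
        buckets[key].append(index)
    return list(buckets.values())
```

Exact-key bucketing misses typos and abbreviations, and it treats a missing country as its own value; both are decisions that would need to be documented and justified.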
Resources
If you’re ready to jump into the problem, please start with the following file:
Expected Deliverables
Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here, so feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Output
Return the updated dataset where you have correctly identified unique companies and grouped duplicate records accordingly.
Code and Logic
Include the code that enabled you to achieve the required entity resolution for the provided list.
Submit your project
When you’re finished with the challenge, please submit the link to your GitHub project below.
Task
The goal is to consolidate duplicates into a single, enriched entry per product, maximizing available information while ensuring uniqueness.
Context
The dataset contains product details extracted from various web pages using LLMs, resulting in duplicate entries where the same product appears across different sources. Each row represents partial attributes of a product.
Guidelines
- Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
- Thoroughly analyze the dataset to understand each attribute clearly.
- There isn’t always a single solution to this problem. Some decisions may be neither strictly right nor wrong, but they should be supported by as many relevant factors as possible.
- It’s essential to document your decisions and the reasoning behind them.
- From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Scala, Java, Python).
- At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem: if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
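Once duplicate rows for a product have been identified, consolidating them means choosing a value per attribute. The sketch below shows one possible policy, stated as an assumption: list-valued attributes are unioned, scalar attributes take the most frequent non-empty value (first seen wins ties), and attributes with no value stay `None`. The function name `merge_records` and the example field names are hypothetical, and scalar values are assumed to be hashable.

```python
def merge_records(duplicates: list[dict]) -> dict:
    """Merge partial records for one product into a single enriched record."""
    merged: dict = {}
    for record in duplicates:
        for key in record:
            merged.setdefault(key, None)  # keep first-seen attribute order

    for key in merged:
        # Ignore empty placeholders so they never outvote real values.
        values = [r[key] for r in duplicates if r.get(key) not in (None, "", [])]
        if not values:
            continue
        if any(isinstance(v, list) for v in values):
            # Union list-valued attributes, preserving first-seen order.
            combined = []
            for value in values:
                for item in value if isinstance(value, list) else [value]:
                    if item not in combined:
                        combined.append(item)
            merged[key] = combined
        else:
            # Majority vote among scalar values; ties go to the first seen.
            counts: dict = {}
            for value in values:
                counts[value] = counts.get(value, 0) + 1
            merged[key] = max(counts, key=counts.get)
    return merged
```

A majority vote is only one option; preferring the longest description or the most trusted source are equally defensible choices, provided the reasoning is documented.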
Resources
If you’re ready to jump into the problem, please start with the following file:
Expected Deliverables
Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here, so feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Output
Return the updated dataset where you have correctly consolidated duplicates into a single, enriched entry per product, maximizing available information while ensuring uniqueness.
Code and Logic
Include the code that enabled you to achieve the required product deduplication for the provided list.
Submit your project
When you’re finished with the challenge, please submit the link to your GitHub project below.
Task
Design an algorithm that will group together HTML documents which are similar from the perspective of a user who opens them in a web browser.
Guidelines
- The dataset contains 4 subdirectories of increasing complexity.
- Your algorithm should work reasonably well on all the “datasets” with no specific optimizations / implementations for one of them.
- Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
- Explore this from as many different angles as you can. It will generate valuable questions.
- From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Node, Python, Scala).
- At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem: if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
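One angle worth exploring is structural similarity: pages built from the same template tend to share tag sequences. The sketch below compares documents by the Jaccard similarity of their tag shingles, using only the standard library. Treat it as an assumption-laden proxy, since tag structure is not the same as what a user sees in a browser (CSS, images and text are ignored); the names `TagCollector`, `tag_shingles` and `jaccard` are hypothetical.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the sequence of opening tags in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_shingles(html: str, size: int = 3) -> set[tuple[str, ...]]:
    """Set of consecutive tag runs of length `size` (shingles)."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    tags = collector.tags
    if len(tags) < size:
        return {tuple(tags)} if tags else set()
    return {tuple(tags[i:i + size]) for i in range(len(tags) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Pairwise scores like these could then feed a threshold-based grouping step; whether the score tracks perceived similarity on the harder subdirectories is something to verify rather than assume.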
Resources
If you’re ready to jump into the problem, please start with the following dataset:
Expected Deliverables
- Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here, so feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
- Output
Your program should take one subdirectory at a time and output the grouped documents, something along the lines of: [A.html, B.html], [C.html], [D.html, E.html, F.html] …
- Code and Logic
Include the code that enabled you to achieve this task for the provided list, and ideally, for any list of any size.
Submit your project
When you’re finished with the challenge, please submit the link to your GitHub project below.
Can’t wait to see what you come up with. Enjoy the process. We’re confident you’ll kick ass!